SAMPLING THEORY

DES RAJ

Sampling Expert, serving under the
United Nations Program of Technical Assistance



TATA McGRAW-HILL PUBLISHING COMPANY LTD. 

New Delhi 





Copyright © 1968, by McGraw-Hill, Inc.

All Rights Reserved.

No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by
any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior
written permission of the publisher.


TMH Edition 1971 

Reprinted 1976 

Reprinted 1978 


Reprinted in India by arrangement with McGraw-Hill, Inc.,
New York.


This edition can be exported from India only by the Publishers, 
Tata McGraw-Hill Publishing Company Ltd. 


Published by Tata McGraw-Hill Publishing Company Limited and 
Printed by Mohan Makhijani at Rekha Printers Pvt. Ltd., New Delhi-110020 



to PUSHPA. 





Foreword 



We are living in a world in which most countries are making strenuous
efforts to raise the living standards of their people. In order to achieve
balanced development, carefully worked-out plans are drawn up and
executed as far as possible. To formulate these plans in a scientific manner,
it is essential to have basic facts in numerical terms for the various regions
in the country and for the country as a whole.

It is beyond the resources of smaller countries to collect facts year after 
year from each person, establishment, or farm in the country. Fortunately, 
as we know now, it is not essential to enumerate each unit in the universe 
in order to arrive at an acceptable figure for the total. A carefully designed 
sample can provide the necessary information for guidelines that a country 
needs, at a cost the country may well afford. 

The Statistical Office of the United Nations has been deeply concerned,
since its organization in 1946, with ways and means to assist national
governments to obtain the statistical data so indispensable for planning
economic and social development, for checking on current implementation
of programs, and for assessing results.

In this work, the United Nations has been assisted in two important ways.
First, it has been aided by the United Nations Statistical Commission, which
initiated and maintained a strong impetus toward the promotion and
elaboration of sampling methods and toward the establishment of sampling
offices in national governments. The Commission was greatly assisted in
its efforts by the work of its Subcommission on Statistical Sampling,
which recommended principles that could guide the development of suitable
methodology in the developing countries. The composition of the
Subcommission was enough to ensure the highest professional level of
recommendations. Sir Ronald Fisher, Professor P. C. Mahalanobis, Dr. W. E.
Deming, Dr. F. Yates, and Professor G. Darmois gave unstintingly of their
time in elaborating the basic principles. In this work they were frequently
joined by other distinguished experts.

Second, and this has been equally important in the development of national
statistical systems, is the application of sampling methods to the practical
problems encountered by developing countries. These countries generally
had no long tradition of censuses, or similar periodic compilations, to use
as a framework for sampling. The application of sampling methods that
would produce acceptable results in such situations therefore required the
utmost ingenuity.

Furthermore, the United Nations was fortunate in having the services of
such experts as Dr. Des Raj under the auspices of the United Nations
Technical Assistance Programs for practical field assignments. Dr. Des
Raj has performed distinguished service in adapting theoretical principles
to the practical conditions he found in one country in Southern Europe and
one in Africa. At this writing he is continuing his service in another African
country, especially oriented to training.

The book deals very competently with the application of sampling theory at
an intermediate level; the operational conditions prevalent in many
developing countries today are thoroughly considered. Taking into account the
fact that most countries are now relying increasingly on sampling methods,
the book bridges the gap between the highly theoretical material available
and the application of methods in conditions that are far from optimum.
This book is a very welcome addition to a literature which is far too scanty.

W. R. Leonard, Special Adviser          P. J. Loftus, Director
United Nations Institute                Statistical Office
for Training and Research               United Nations




Preface 




My objective in writing this book has been to provide an up-to-date account 
of sampling theory as it has been developed during the last three decades. 
The book is intended for students who want to learn sampling theory at an 
intermediate level and for research workers who need to be familiar with 
the latest developments in the subject. It should also serve as a reference 
work for the practicing statistician. The presentation is systematic and 
rigorous, and is based on the author's lectures to postgraduate students at 
the Indian Statistical Institute, Calcutta, and at Lucknow, Agra, Beirut, 
Athens, Addis Ababa, and Ibadan. 

Probability theory forms the basis of sampling methods, and I have made no
secret of this fact. A good working knowledge of algebra, calculus, and
probability on the mathematical side, and of general statistical methods
and elementary estimation theory on the statistical side is essential for a
proper understanding of the rigorous development of sampling theory.
Here and there, I have not hesitated to give an elegant proof which
incorporates rather advanced mathematical tools. Most of the theory has been
presented in the form of theorems, which are followed by remarks on the
practical use of the results obtained and their interrelationships. Because
sampling theory is meant to be used in practice, I have taken every
opportunity to direct the reader's attention to the practical value of the
results obtained.

I have started by proving a few theorems in mathematical expectation and
some other results needed for developing the proofs in subsequent chapters.
For a proper appreciation of sampling theory, Chapter 2 offers background
information concerning the work of a sampling statistician in the
field. Chapter 3 presents the three basic methods of sample selection:
simple random sampling, systematic sampling, and sampling with
probability proportionate to size. The use of auxiliary information by way of
stratification and through ratio and regression estimation is explained in
Chapters 4 and 5. Sampling and subsampling of clusters, double sampling
procedures, and the problems peculiar to repetitive surveys form the
subject matter of Chapters 6 and 7. How to plan the survey and analyze the
data in the presence of response errors is discussed in Chapter 8.
Developments not previously included, such as the stability of variance
estimators and subpopulation analysis, are discussed in Chapter 9. This is
followed by a long exercises section, which forms an integral part of the
book. The exercises carry the general theory beyond the stage reached in the
text. They are based on about one hundred research papers published on the
subject in different journals. The more difficult exercises are solved as
they are presented. In all cases, references are given and the exact
advance made is indicated.

In addition to the excellent books by Hansen et al., Cochran, Yates,
Sukhatme, and Deming, there are a large number of research papers on
sampling methods and theory. It is, therefore, difficult to make a detailed
acknowledgement of all sources of material included in the book. Except
for recent developments, no attempt has been made to trace the original
sources of sampling theory.

This book could not have been written without the opportunity, made
possible by the United Nations and the government of Greece, to use sampling
methods on a variety of populations, for which I shall ever remain grateful.
Also, thanks are due to Mrs. Kondouli-Baima for her patient help in typing.

DES RAJ




Contents 


FOREWORD

PREFACE

CHAPTER 1 MATHEMATICAL PRELIMINARIES

1.1 Introduction. 1.2 Sample space and events. 1.2.1 Probability. 1.2.2
Random variables. 1.2.3 Illustrations. 1.3 Expected value. 1.4 Variance
and covariance. 1.5 Variance of products. 1.6 Conditional expectation.
1.6.1 Conditional variance and covariance. 1.7 The [illegible]. 1.8 An
example. 1.9 Intraclass correlation coefficient. 1.10 The best weight
function. 1.11 Minimum variance unbiased estimation. 1.12 Two [illegible]
theorems. References.


CHAPTER 2 SAMPLE SURVEY BACKGROUND

2.1 [illegible] problem. 2.2 [illegible]. 2.3 Sample versus complete
enumeration. 2.4 The role of the sampling method. 2.5 Planning and
execution of sample surveys. 2.6 Judgment sampling. 2.7 Probability
sampling. 2.8 Formation of estimators. 2.9 Unbiased estimation. 2.10
Precision of estimators. 2.11 Biased estimators. 2.11 Confidence
intervals. 2.12 The question of cost. 2.12 The fundamental principle of
sample design. 2.14 Scope of this book. Reference.


CHAPTER 3 BASIC METHODS OF SAMPLE SELECTION

3.1 Simple random sampling. 3.2 Estimation in simple random sampling. 3.3
Estimation of sampling error in wtr (without-replacement) sampling. 3.4
Sampling with replacement. 3.5 A general procedure. 3.6 A better
estimator in wr sampling. 3.7 Estimation of proportions. 3.8 Systematic
sampling. 3.9 Estimation in systematic sampling. 3.10 An alternative
expression for the variance. 3.11 Estimation of variance in systematic
sampling. 3.12 Sampling with unequal probabilities. 3.13 An alternative
sampling procedure. 3.14 Estimation in wr pps sampling. 3.15 Comparison
with sampling with equal probabilities. 3.16 Sampling without replacement
with unequal probabilities. 3.17 A more general selection procedure. 3.18
Another type of selection procedure. 3.19 Estimation procedures. 3.20
Estimation of variance. 3.21 Two useful relations. 3.22 An alternative
expression for the variance. 3.23 Comparison of wtr and wr schemes. 3.24
Another procedure of estimation in wtr sampling. 3.25 Estimation of
variance. References.


CHAPTER 4 STRATIFICATION

4.1 Introduction. 4.2 Estimation in stratified sampling. 4.3 Allocation of
sample to strata. 4.4 Allocation in simple random sampling. 4.4.1
X-proportional allocation. 4.4.2 Estimation of proportions. 4.5 Allocation
in unequal-probability sampling. 4.6 Formation of strata. 4.6.1
Proportional allocation. 4.6.2 Equal allocation. 4.6.3 Optimum allocation.
4.6.4 Strata of equal aggregate size. 4.7 The number of strata. 4.8 Some
practical situations. 4.8.1 The method of collapsed strata. 4.8.2
Estimation of gain due to stratification. 4.8.3 Dependent selection. 4.8.4
Estimating several parametric functions. 4.9 The stratum of
nonrespondents. 4.10 Latin Square stratification. References.


CHAPTER 5 FURTHER USE OF SUPPLEMENTARY INFORMATION

5.1 Introduction. 5.2 Ratio estimation. 5.3 Bias of the ratio estimate. 5.4
An approximate expression for bias. 5.5 Mean square error. 5.6 Bounds
on the MSE. 5.7 Comparison with simple average. 5.8 Sample estimate of
MSE. 5.9 Unbiased ratio estimation. 5.10 The variance of the unbiased
ratio estimator. 5.11 Relative precision of the unbiased ratio estimator.
5.12 Unbiased ratio-type estimators. 5.13 Difference estimation. 5.14
Regression estimation. 5.15 Use of multiauxiliary information. 5.16 The
case of two x-variates. 5.17 Multivariate ratio estimation. 5.18 Ratio
estimation in stratified sampling. References.


CHAPTER 6 SAMPLING AND SUBSAMPLING OF CLUSTERS

6.1 Introduction. 6.2 Single-stage cluster sampling. 6.2.1 Estimation of
proportions. 6.2.2 Estimation of efficiency of cluster sampling. 6.2.3 Other
estimators in cluster sampling. 6.3 Multistage sampling. 6.3.1 Calculation
of variance. 6.3.2 Estimation of variance. 6.4 Selection of psu's with
unequal probabilities. 6.5 Selection of psu's with replacement. 6.5.1
Selection with replacement (scheme C). 6.6 Comparison of schemes A and B.
6.6.1 Comparison based on the sample. 6.7 Stratified multistage sampling.
6.8 Estimation of ratios. 6.8.1 Sampling without replacement. 6.8.2
Sampling with replacement. 6.8.3 Extension to stratified sampling. 6.9
Choice of sampling and subsampling fractions. 6.9.1 Choice of optimum
probabilities. 6.10 Some useful multistage designs. 6.10.1 Randomized
systematic sampling of psu's. 6.10.2 One psu per stratum. 6.10.3 One psu
per randomized substratum. 6.10.4 Psu's selected with pps of remainder.
References.


CHAPTER 7 DOUBLE-SAMPLING PROCEDURES AND REPETITIVE SURVEYS

7.1 Introduction. 7.2 Double sampling for difference estimation. 7.2.1
Independent samples. 7.3 Double sampling for pps estimation. 7.3.1 The
case of independent samples. 7.4 Double sampling with pps selection.
7.4.1 Independent samples. 7.5 Double sampling for unbiased ratio
estimation. 7.6 Double sampling for biased ratio estimation. 7.6.1
Comparison with the difference estimator. 7.7 Double sampling for
regression estimation. 7.8 Double sampling for stratification. 7.9
Repetitive surveys. 7.9.1 Sampling over two occasions. 7.9.2 Minimum
variance current estimates. 7.9.3 Estimation of change. 7.9.4 Estimation of
sum on two occasions. 7.10 Regression estimation in repetitive surveys.
7.11 Sampling on more than two occasions. 7.12 A useful procedure.
References.


CHAPTER 8 NONSAMPLING ERRORS

8.1 Introduction. 8.2 Response errors. 8.3 Response bias. 8.4 The
analysis of data. 8.5 The optimum number of interviewers. 8.6 Estimation of
variance components. 8.7 Some restricted models. 8.8 Uncorrelated
response errors. 8.9 Estimation of response bias. 8.10 Extension to other
sampling designs. 8.11 Response and sampling variance. 8.11.1
Application to estimating proportions. 8.11.2 Estimation of simple response
variance. 8.12 The problem of nonresponse. 8.12.1 Practical procedures.
8.13 Some examples of sources of error. References.


CHAPTER 9 OTHER DEVELOPMENTS

9.1 Introduction. 9.2 Variance estimation. 9.2.1 Variance estimation in
stratified sampling. 9.2.2 The number of degrees of freedom. 9.2.3 The
method of random groups. 9.2.4 Variance estimation in wtr sampling. 9.2.5
Variance estimation in randomized pps systematic sampling. 9.3 Estimation
for subpopulations. 9.3.1 Subpopulation analysis for other designs. 9.4
The best linear estimator. 9.5 The method of overlapping maps. 9.5.1
Changing selection probabilities. 9.6 Two-way stratification with small
samples. 9.7 Systematic sampling. 9.7.1 Data exhibiting periodicity. 9.7.2
Populations showing trend. 9.7.3 Autocorrelated populations. 9.8
Controlled selection. 9.9 A general rule for variance estimation in
multistage sampling. 9.10 Sampling from imperfect frames. 9.11 Sampling
inspection. 9.11.1 Double-sampling plans. References.




EXERCISES

REFERENCES TO EXERCISES WITH REMARKS
APPENDIX 1 REPORT ON AN ACTUAL SAMPLE SURVEY
APPENDIX 2 PRINCIPAL NOTATION USED
APPENDIX 3 RANDOM NUMBERS

INDEX









CHAPTER ONE 


MATHEMATICAL PRELIMINARIES


1.1 INTRODUCTION 

As the following chapters will show, probability forms the basis of sampling
theory. Although some knowledge of probability theory is assumed on
the part of the reader, we shall begin with a rapid review of the important
concepts involved. A number of theorems will then be proved for
subsequent use in the development of sampling theory. A few other useful
results on the mathematical side, too, will be included in this chapter.


1.2 SAMPLE SPACE AND EVENTS 

We shall be concerned with random experiments, the outcomes of which
depend on chance. The results of a random experiment will be called
sample points, and the aggregate of all sample points produced by the
experiment will be called the sample space. Every outcome of the
experiment is described by one, and only one, sample point. Any aggregate of
the sample points will be called an event, which will be said to contain
those points. We shall say that the event $H$ has occurred if the sample
point representing the outcome of the experiment is contained in it. The
event $\bar{H}$ consisting of all points not contained in $H$ will be called
the complementary event. The totality of sample points contained in at
least one of the two events $H_1$ and $H_2$ will be called the sum (or union)
of $H_1$ and $H_2$ and will be denoted by $H_1 + H_2$. Further, the aggregate
of sample points contained in both of the events $H_1$ and $H_2$ will be
called the product (or intersection) of the two events and will be denoted by
$H_1H_2$. The events $H_1$ and $H_2$ are mutually exclusive if they have no
points in common.


1.2.1 PROBABILITY 

We shall be dealing with sample spaces in which the number of points is
either finite or countably infinite (enumerable). Let the points of such a
sample space be denoted by $E_1, E_2, \ldots$. With each point $E_i$ will be
associated a nonnegative number, called the probability of $E_i$ (Feller,
1950) and denoted by $\Pr(E_i)$, such that $\sum_i \Pr(E_i) = 1$. The
probability of an event $G$, denoted by $\Pr(G)$, is then defined to be the
sum of the probabilities of all sample points contained in it. For two events
$G_1$ and $G_2$, it is easy to establish that

$$\Pr(G_1 + G_2) = \Pr(G_1) + \Pr(G_2) - \Pr(G_1G_2) \tag{1.1}$$

Further, it is convenient to express $\Pr(G_1G_2)$ as

$$\Pr(G_1G_2) = \Pr(G_1)\Pr(G_2 \mid G_1) \tag{1.2}$$

where $\Pr(G_2 \mid G_1)$ is called the conditional probability of the event
$G_2$, given that the event $G_1$ has occurred. When
$\Pr(G_2 \mid G_1) = \Pr(G_2)$, the event $G_2$ will be said to be independent
of $G_1$. In this case we have

$$\Pr(G_1G_2) = \Pr(G_1)\Pr(G_2) \tag{1.3}$$


1.2.2 RANDOM VARIABLES 

Let there be a random experiment generating a sample space with
sample points $E_1, E_2, \ldots$ and the associated probabilities $\Pr(E_1)$,
$\Pr(E_2), \ldots$. A function on this sample space will now be defined. Let
there be a rule by which a number $U$ is associated with each point of the
sample space. Following this rule, we assign real numbers $u_1, u_2, \ldots$
to the points $E_1, E_2, \ldots$, respectively. By collecting all points to
which the same number $u_i$ is assigned, we form the event $U = u_i$, which
will be interpreted to mean that the random variable $U$ takes the value
$u_i$. The set of relations

$$\Pr(U = u_i) = g(u_i) \qquad \sum_i g(u_i) = 1 \qquad (i = 1, 2, \ldots) \tag{1.4}$$

defines the probability distribution of the random variable $U$.

Now let there be two random variables $U$ and $W$ defined on the same
sample space. Let $U$ assume values $u_i$ $(i = 1, 2, \ldots)$ with
probabilities $g(u_i)$ $(i = 1, 2, \ldots)$, and let $W$ assume the values
$w_j$ with probabilities $h(w_j)$ $(j = 1, 2, \ldots)$. Then the set of
relations

$$\Pr(U = u_i, W = w_j) = p(u_i, w_j) \qquad \sum_i \sum_j p(u_i, w_j) = 1 \tag{1.5}$$

defines the joint probability distribution of $U$ and $W$.

The conditional probability of the event $W = w_j$, given that the
event $U = u_i$ has occurred, will be defined as

$$\Pr(W = w_j \mid U = u_i) = \frac{\Pr(U = u_i, W = w_j)}{\Pr(U = u_i)} = \frac{p(u_i, w_j)}{g(u_i)} \tag{1.6}$$

In general this conditional probability will be different from the absolute
probability $\Pr(W = w_j)$, and the two random variables will be called
dependent. In case

$$\Pr(W = w_j \mid U = u_i) = \Pr(W = w_j)$$

we have

$$\Pr(U = u_i, W = w_j) = \Pr(U = u_i)\Pr(W = w_j)$$

or

$$p(u_i, w_j) = g(u_i)h(w_j)$$

If this equation holds for all combinations of $u_i, w_j$, the random
variables $U$ and $W$ are called independent.


1.2.3 ILLUSTRATIONS 

1. There is a population (aggregate) of four units (objects) $A_1$, $A_2$,
$A_3$, and $A_4$. If a pair of different units is to be selected, there will
be six pairs in all, namely, $A_1A_2$, $A_1A_3$, $A_1A_4$, $A_2A_3$, $A_2A_4$,
$A_3A_4$. We agree to assign equal probabilities to all pairs so that each
pair has a probability of 1/6. Thus the random experiment consists in
selecting two objects from a group of four. There are six outcomes of this
experiment, giving six points in the sample space associated with this random
experiment. Each point has a probability of 1/6, the sum of probabilities of
all the six points being unity. Consider the event $G_1$ that $A_3$ is
included in the pair selected. There are three points in the sample space,
namely, $A_1A_3$, $A_2A_3$, and $A_3A_4$, which give rise to the event $G_1$.
Since each of these has a probability of 1/6, the probability of the event
$G_1$ is $\Pr(G_1) = 3/6 = 1/2$. Consider another event $G_2$ in which $A_4$
is included in the pair selected. Obviously, $\Pr(G_2) = 1/2$. The event
$G_1G_2$ is said to occur when both $G_1$ and $G_2$ occur. Since there is only
one sample point, $A_3A_4$, that gives rise to the joint appearance of $A_3$
and $A_4$, $\Pr(G_1G_2) = 1/6$. The event $G_1 + G_2$ that at least one of
$G_1$, $G_2$ occurs contains the points $A_1A_3$, $A_1A_4$, $A_2A_3$,
$A_2A_4$, and $A_3A_4$, and so $\Pr(G_1 + G_2) = 5/6$. Given that the event
$G_1$ has occurred, the relevant sample points are $A_1A_3$, $A_2A_3$, and
$A_3A_4$. In this part of the sample space the event $G_2$ has only one sample
point, namely $A_3A_4$. To make the total probability unity on this part of
the sample space, each point is given a probability of 1/3. Thus
$\Pr(G_2 \mid G_1) = 1/3$, which is the conditional probability of $G_2$ given
that $G_1$ has occurred. This is equivalent to saying that
$\Pr(G_2 \mid G_1) = \Pr(G_1G_2)/\Pr(G_1)$ on the original sample space. Since
$\Pr(G_2 \mid G_1) = 1/3 \ne 1/2 = \Pr(G_2)$, the two events $G_1$ and $G_2$
are not independent. The absolute probability of $G_2$ is different from the
conditional probability of $G_2$ given that $G_1$ has occurred.
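
The probabilities in this illustration can be checked by listing the sample space directly. The following is only a minimal sketch in Python; the names `sample_space`, `prob`, `g1`, and `g2` are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations

units = ["A1", "A2", "A3", "A4"]

# Sample space: the six unordered pairs of different units, each with
# probability 1/6 (the equal-probability assignment agreed on above).
sample_space = {pair: Fraction(1, 6) for pair in combinations(units, 2)}


def prob(event):
    """Sum the probabilities of the sample points contained in the event."""
    return sum(p for point, p in sample_space.items() if event(point))


def g1(pair):
    """Event G1: A3 is included in the selected pair."""
    return "A3" in pair


def g2(pair):
    """Event G2: A4 is included in the selected pair."""
    return "A4" in pair


p_g1 = prob(g1)
p_g2 = prob(g2)
p_both = prob(lambda pair: g1(pair) and g2(pair))   # Pr(G1 G2)
p_union = prob(lambda pair: g1(pair) or g2(pair))   # Pr(G1 + G2)
p_g2_given_g1 = p_both / p_g1                       # relation (1.2)

print(p_g1, p_g2, p_both, p_union, p_g2_given_g1)
```

Exact fractions are used rather than floating-point numbers so that the comparison between the conditional and the absolute probability is not blurred by rounding.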

2. Consider a collection of 16 different tickets, each bearing the pair
$A_iA_j$ $(i, j = 1, 2, 3, 4)$. The random experiment consists in selecting a
ticket and noting down the pair on it. Then the sample space consists
of the following 16 points:

    A1A1  A1A2  A1A3  A1A4  A2A1  A2A2  A2A3  A2A4
    A3A1  A3A2  A3A3  A3A4  A4A1  A4A2  A4A3  A4A4

Suppose that the eight sample points in the first row have a probability
of 1/24 each and that the other points have a probability of 1/12 each.
Consider the event $G_1$ in which $A_3$ is the first member of the pair
selected. This event has four points, $A_3A_1$, $A_3A_2$, $A_3A_3$, $A_3A_4$,
in it, and therefore $\Pr(G_1) = 1/3$. Similarly, the event $G_2$ in which
$A_4$ is the second member of the pair has four points $A_1A_4$, $A_2A_4$,
$A_3A_4$, $A_4A_4$, and so $\Pr(G_2) = 1/4$. The event $G_1G_2$, which
consists of the joint occurrence of $G_1$ and $G_2$, has just one point in it
and that is $A_3A_4$. Hence $\Pr(G_1G_2) = 1/12$. Given that $G_1$ has
occurred, the only relevant sample points are $A_3A_1$, $A_3A_2$, $A_3A_3$,
$A_3A_4$. In the sample space of this event, the event $G_2$ has one
sample point, namely $A_3A_4$. Thus $\Pr(G_2 \mid G_1) = 1/4 = \Pr(G_2)$.
Whether or not we know that $G_1$ has occurred, the probability of $G_2$ is
the same. The two events $G_1$ and $G_2$ are independent. We find that

$$\Pr(G_1G_2) = \Pr(G_1)\Pr(G_2)$$

in this case. The event $G_1 + G_2$ that at least one of $G_1$ or $G_2$
occurs has seven points in it, from which we find that
$\Pr(G_1 + G_2) = 1/2$.
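
A short enumeration confirms the independence of the two events in this illustration. This is a sketch only, assuming the point probabilities are 1/24 for tickets whose first member is $A_1$ or $A_2$ and 1/12 for the rest; the names `tickets` and `prob` are illustrative.

```python
from fractions import Fraction

# Ticket (i, j) stands for the pair A_i A_j. Tickets in the first row of the
# listing (first member A1 or A2) get 1/24; the remaining eight get 1/12.
tickets = {
    (i, j): Fraction(1, 24) if i <= 2 else Fraction(1, 12)
    for i in range(1, 5)
    for j in range(1, 5)
}


def prob(event):
    """Sum the probabilities of the tickets contained in the event."""
    return sum(p for point, p in tickets.items() if event(point))


total = prob(lambda t: True)                        # must equal 1
p_g1 = prob(lambda t: t[0] == 3)                    # A3 is the first member
p_g2 = prob(lambda t: t[1] == 4)                    # A4 is the second member
p_both = prob(lambda t: t[0] == 3 and t[1] == 4)    # only the ticket A3A4
p_union = prob(lambda t: t[0] == 3 or t[1] == 4)    # seven tickets

independent = p_both == p_g1 * p_g2
print(total, p_g1, p_g2, p_both, p_union, independent)
```

The example shows that independence does not require equally likely sample points; it only requires that the product relation (1.3) hold.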


3. Suppose a population consists of five families $F_1$, $F_2$, $F_3$, $F_4$,
$F_5$ with incomes (measured on some scale) of 20, 60, 40, 20, and 60,
respectively. The random experiment consists in selecting a pair of different
families without regard to their order. Then the sample space has 10 points.
Let each point have a probability of 1/10. With each point we associate a
value, for example, the mean income of the two families. We may define
a random variable or random function $U$ on this space by saying that for
any outcome (which will be a pair of families) of the experiment the
random variable $U$ takes up the value associated with the point. Table
1.1 then gives the values of $U$ for different points of the sample space.


Table 1.1 The random function U

    Sample point   Pr     Value of U     Sample point   Pr     Value of U
    F1,F2          1/10   40             F2,F4          1/10   40
    F1,F3          1/10   30             F2,F5          1/10   60
    F1,F4          1/10   20             F3,F4          1/10   30
    F1,F5          1/10   40             F3,F5          1/10   50
    F2,F3          1/10   50             F4,F5          1/10   40


From this table we find that $\Pr(U = 20) = 1/10$, $\Pr(U = 30) = 2/10$,
$\Pr(U = 40) = 4/10$, $\Pr(U = 50) = 2/10$, and $\Pr(U = 60) = 1/10$. Thus
the random variable $U$ takes up the values 20, 30, 40, 50, and 60 with
probabilities 1/10, 2/10, 4/10, 2/10, and 1/10, respectively. This is called
the probability distribution of the random variable $U$.
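
The distribution of $U$ can be tabulated mechanically by collecting the sample points that share the same value, as in the definition of the event $U = u_i$. A minimal sketch, with the dictionary names `incomes` and `distribution` chosen only for illustration:

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations

# Incomes of the five families in the illustration.
incomes = {"F1": 20, "F2": 60, "F3": 40, "F4": 20, "F5": 60}

# Sample space: the 10 unordered pairs of different families, equally likely.
pairs = list(combinations(incomes, 2))
point_prob = Fraction(1, len(pairs))

# Collect all points on which U (the mean income of the pair) takes the
# same value, and add up their probabilities.
distribution = defaultdict(Fraction)
for first, second in pairs:
    u = Fraction(incomes[first] + incomes[second], 2)
    distribution[u] += point_prob

for value in sorted(distribution):
    print(value, distribution[value])
```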

For the same population of the five families, consider another
experiment, consisting of the selection of a pair of families; repetitions are
included and due regard is paid to the order in which the families occur in
the pair. The sample space then contains the 25 points:

    F1,F1   F2,F1   F3,F1   F4,F1   F5,F1
    F1,F2   F2,F2   F3,F2   F4,F2   F5,F2
    F1,F3   F2,F3   F3,F3   F4,F3   F5,F3
    F1,F4   F2,F4   F3,F4   F4,F4   F5,F4
    F1,F5   F2,F5   F3,F5   F4,F5   F5,F5

and we agree to associate a probability of 1/25 with each point. Consider
a random variable $V$ which assumes a value equal to the income of the
family occurring first in the pair. Similarly, the random variable $W$
takes on the value equal to the income of the second member of the pair.




The values of $V$ and $W$ for the 25 sample points are:

    V,W     V,W     V,W     V,W     V,W
    20,20   60,20   40,20   20,20   60,20
    20,60   60,60   40,60   20,60   60,60
    20,40   60,40   40,40   20,40   60,40
    20,20   60,20   40,20   20,20   60,20
    20,60   60,60   40,60   20,60   60,60

The joint probability that $V = 20$ and $W = 20$ is 4/25, since there are
four points involved. Similarly, $\Pr(V = 40, W = 60) = 2/25$. Such
relations could be conveniently exhibited in the form of the two-way table
(Table 1.2).


Table 1.2 Joint distribution of V and W

                        W
                 20      40      60      Pr(V)
         20     4/25    2/25    4/25     2/5
    V    40     2/25    1/25    2/25     1/5
         60     4/25    2/25    4/25     2/5
       Pr(W)    2/5     1/5     2/5       1


The entries in the body of Table 1.2 are probabilities, such
as $\Pr(V = v, W = w)$, and they specify the joint distribution of $V$ and
$W$. The margins give the probability distributions of $V$ and $W$. It will be
noted from Table 1.2 that $\Pr(V = v, W = w) = \Pr(V = v)\Pr(W = w)$.
Such random variables are called independent. Given that $V = 20$, the
conditional probability that $W = 60$ is given by $(4/25)/(2/5) = 2/5$, and we
see that $\Pr(W = 60 \mid V = 20) = 2/5 = \Pr(W = 60)$. It may also be noted
that $V$ and $W$ have the same probability distribution.
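
The joint distribution for selection with repetitions can be rebuilt from the 25 equally likely ordered pairs, and the product rule checked cell by cell. This is a sketch under that equal-probability assumption; `joint`, `marg_v`, and `marg_w` are illustrative names.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

incomes = [20, 60, 40, 20, 60]  # incomes of F1, ..., F5

# Ordered pairs with repetitions allowed: 25 points, each with probability 1/25.
joint = defaultdict(Fraction)
for v, w in product(incomes, repeat=2):
    joint[(v, w)] += Fraction(1, 25)

# Marginal distributions are the row and column sums of the joint table.
marg_v = defaultdict(Fraction)
marg_w = defaultdict(Fraction)
for (v, w), p in joint.items():
    marg_v[v] += p
    marg_w[w] += p

# V and W are independent if every cell equals the product of its margins.
independent = all(
    joint[(v, w)] == marg_v[v] * marg_w[w] for v in marg_v for w in marg_w
)

cond_w60_given_v20 = joint[(20, 60)] / marg_v[20]
print(dict(joint), independent, cond_w60_given_v20)
```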

For the same population of five families, consider now the selection
of a pair of different families, order being considered. The sample space
then contains 20 points. Suppose each point has a probability of 1/20.
Give random variables $X$ and $Y$ values equal to the income of the first
family in the pair and the second family in the pair, respectively. Then
their joint distribution is given by Table 1.3.

In this case $\Pr(X = x, Y = y) \ne \Pr(X = x)\Pr(Y = y)$. The conditional
probabilities are not the same as the absolute probabilities. The
random variables $X$ and $Y$ are not independent.


Table 1.3 Joint distribution of X and Y

                        Y
                 20      40      60      Pr(X)
         20     2/20    2/20    4/20     2/5
    X    40     2/20     0      2/20     1/5
         60     4/20    2/20    2/20     2/5
       Pr(Y)    2/5     1/5     2/5       1


1.3 EXPECTED VALUE 

With the background of Sec. 1.2, we prove a number of results in
probability theory which will be used in the chapters that follow. Let there
be a random variable $U$ taking the values $u_i$ $(i = 1, \ldots, k)$ with
probability $p_i$ $(i = 1, \ldots, k)$, $\sum p_i = 1$. Then the expected
value of $U$ is defined as

$$E(U) = \sum_i p_i u_i \tag{1.7}$$

It follows that the expected value of $aU$ (where $a$ is a constant) is
$E(aU) = aE(U)$, since the random variable $aU$ takes the values $au_i$
$(i = 1, \ldots, k)$ with probabilities $p_i$. In the same way

$$E(aU + b) = aE(U) + b$$

where $b$ is a constant. In general, let $\phi(U)$ be a function of $U$. Then

$$E\phi(U) = \sum_i p_i \phi(u_i)$$

Let there be another random variable $W$ taking values $w_j$
$(j = 1, \ldots, l)$ with probabilities $p_j$, $\sum p_j = 1$. Further, let the
joint distribution of $U$ and $W$ be given by the relations

$$\Pr(U = u_i, W = w_j) = p_{ij} \qquad \sum_i \sum_j p_{ij} = 1$$

Then, it is obvious that

$$p_i = \Pr(U = u_i) = \sum_j p_{ij} \qquad p_j = \Pr(W = w_j) = \sum_i p_{ij}$$

Consider now the random variable $Z = U + W$. The expected value of
$Z$ can be written down from first principles by listing the values that $Z$
can take up along with their probabilities. But this can be done more
conveniently by using the following theorem.







Theorem 1.1

If $U$ and $W$ are two random variables,

$$E(U + W) = E(U) + E(W) \tag{1.8}$$


PROOF By definition

$$E(U + W) = \sum_i \sum_j (u_i + w_j) p_{ij}
= \sum_i u_i \sum_j p_{ij} + \sum_j w_j \sum_i p_{ij}
= \sum_i p_i u_i + \sum_j p_j w_j = E(U) + E(W) \qquad ■$$

In words, the theorem states that the expected value of the sum of
two random variables is the sum of the expected values of the random
variables. This theorem is remarkable in the sense that it holds whether
or not the variables $U$ and $W$ are independent.
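
That the additivity of expected values does not depend on independence can be seen on the dependent pair $X$, $Y$ of Table 1.3 (incomes of the first and second of two different families drawn in order from the incomes 20, 60, 40, 20, 60). A minimal sketch, with `points` and `expect` as illustrative names:

```python
from fractions import Fraction
from itertools import permutations

incomes = [20, 60, 40, 20, 60]  # incomes of F1, ..., F5

# Ordered pairs of different families, each with probability 1/20.
# X is the income of the first family, Y that of the second.
points = [(x, y, Fraction(1, 20)) for x, y in permutations(incomes, 2)]


def expect(f):
    """Expected value of f(X, Y): sum of probability times value."""
    return sum(p * f(x, y) for x, y, p in points)


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
e_sum = expect(lambda x, y: x + y)

print(e_x, e_y, e_sum)
```

Here $E(X + Y)$ is computed directly from the sample points, without using the theorem, and agrees with $E(X) + E(Y)$ even though $X$ and $Y$ are not independent.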


Remark Theorem 1.1 can now be generalized to $n$ random variables.
It states that

$$E(U_1 + U_2 + \cdots + U_n) = \sum_{i=1}^{n} E(U_i)$$


Remark Given the constants $c_1, c_2, \ldots, c_n$, we have

$$E\left(\sum c_i U_i\right) = \sum c_i E(U_i)$$

In case the variables $U$ and $W$ are independent, we have

$$\Pr(U = u_i, W = w_j) = \Pr(U = u_i)\Pr(W = w_j)$$

which means that

$$p_{ij} = p_i p_j \tag{1.9}$$

Then we prove the following result.

Theorem 1.2

If $U$ and $W$ are independent,

$$E(UW) = E(U)E(W) \tag{1.10}$$


PROOF

$$E(UW) = \sum_i \sum_j u_i w_j p_{ij} = \sum_i p_i u_i \sum_j p_j w_j = E(U)E(W) \qquad ■$$



Corollary

If $f_1(U)$ and $f_2(W)$ are any two functions of the independent random
variables $U$ and $W$, we have

$$E[f_1(U)f_2(W)] = E[f_1(U)]E[f_2(W)]$$

Now let $U$ and $W$ be not necessarily independent. For a given value
$u_i$ of $U$, the conditional expectation of $W$ would be
$\left(\sum_j p_{ij} w_j\right) / p_i$, which we may call $E_2(W)$ for
brevity. Then we prove the more general Theorem 1.3.


Theorem 1.3

$$E(UW) = E[UE(W \mid U)] = E[UE_2(W)]$$

where $E_2$ is the conditional expectation for a given value of $U$.


PROOF

$$E(UW) = \sum_i \sum_j u_i w_j p_{ij}
= \sum_i p_i u_i \frac{\sum_j p_{ij} w_j}{p_i}
= \sum_i p_i u_i E_2(W)
= \sum_i p_i E_2(UW) = E[UE_2(W)] \qquad ■$$


Corollary

If $U$ and $W$ are independent, $E(W \mid U) = E(W)$, and we get Theorem 1.2.


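Example Consider the random variables $X$ and $Y$ having the joint
distribution given in Table 1.3. We shall calculate $E(X)$, $E(Y)$,
$E(X + Y)$, and $E(XY)$ in this case.

$$E(X) = 20 \times \tfrac{2}{5} + 40 \times \tfrac{1}{5} + 60 \times \tfrac{2}{5} = 40 = E(Y)$$

$$E(X + Y) = E(X) + E(Y) = 80$$

For $X$ = 20, 40, and 60 the values of $E_2(XY)$ are $20 \times 45$,
$40 \times 40$, and $60 \times 35$, respectively. Hence, by Theorem 1.3,
$E(XY) = \tfrac{2}{5}(900) + \tfrac{1}{5}(1{,}600) + \tfrac{2}{5}(2{,}100) = 1{,}520$.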
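
The conditional-expectation route of Theorem 1.3 can be compared with a direct enumeration of the 20 equally likely ordered pairs of different families (incomes 20, 60, 40, 20, 60). Enumeration gives conditional means of $Y$ equal to 45, 40, and 35 for $X$ = 20, 40, and 60, and $E(XY) = 1{,}520$ by either route. The sketch below uses illustrative names (`points`, `cond_mean_y`, `prob_x`) that are not part of the text.

```python
from fractions import Fraction
from itertools import permutations

incomes = [20, 60, 40, 20, 60]  # incomes of F1, ..., F5

# Ordered pairs of different families, each with probability 1/20.
points = [(x, y, Fraction(1, 20)) for x, y in permutations(incomes, 2)]

# Direct calculation: E(XY) as the sum of probability times value.
e_xy_direct = sum(p * x * y for x, y, p in points)

# Theorem 1.3: E(XY) = E[X * E(Y | X)].
prob_x = {}
cond_mean_y = {}
for x0 in sorted(set(incomes)):
    rows = [(y, p) for x, y, p in points if x == x0]
    prob_x[x0] = sum(p for _, p in rows)
    cond_mean_y[x0] = sum(p * y for y, p in rows) / prob_x[x0]

e_xy_conditional = sum(prob_x[x0] * x0 * cond_mean_y[x0] for x0 in prob_x)

print(cond_mean_y, e_xy_direct, e_xy_conditional)
```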


1.4 VARIANCE AND COVARIANCE

Let there be a random variable $U$ with $E(U) = \bar{U}$. The variance of $U$
is defined as

$$V(U) = E(U - \bar{U})^2 = E(U^2) - \bar{U}^2 = \sum_i p_i (u_i - \bar{U})^2 \tag{1.11}$$

The positive square root of $V(U)$ is called the standard deviation of $U$
and is denoted by $\sigma(U)$. We note that if $V(U)$ is small, each term in
the sum $\sum p_i (u_i - \bar{U})^2$ is small. This means that a value $u_i$
for which $|u_i - \bar{U}|$ is large must have a small probability $p_i$. In
other words, in case of small variance, large deviations of the random
variable from its expected value are improbable. Consider now the variable
$Z = aU + b$. We have $E(Z) = a\bar{U} + b$,
$V(Z) = E(Z - \bar{Z})^2 = a^2 E(U - \bar{U})^2$, or

$$V(aU + b) = a^2 V(U)$$

Let there be another random variable $W$ defined on the same sample space
as $U$. Let the joint distribution of $U$ and $W$ be specified by the relations

$$\Pr(U = u_i, W = w_j) = p_{ij}$$

Then

$$E(W) = \sum_j p_j w_j = \bar{W} \qquad V(W) = E(W - \bar{W})^2 = \sum_j p_j (w_j - \bar{W})^2$$

We shall now introduce another concept, called the covariance of $U$ and
$W$. It is defined as

$$\operatorname{Cov}(U,W) = E[(U - \bar{U})(W - \bar{W})] = E(UW) - E(U)E(W) \tag{1.12}$$

Obviously, $\operatorname{Cov}(U,U) = E(U - \bar{U})^2 = V(U)$. The covariance of $U$ and
$W$ divided by the product of their standard deviations is called the
correlation coefficient of $U$ and $W$ and is written as

$$\rho(U,W) = \frac{\operatorname{Cov}(U,W)}{\sigma(U)\,\sigma(W)} \tag{1.13}$$


If $U$ and $W$ are independent, $E(UW) = E(U)E(W)$ by Theorem 1.2.
Then $\operatorname{Cov}(U,W)$ vanishes, and so does the correlation coefficient. In
general we shall prove that $\rho$ lies between $-1$ and $+1$. This follows from
the observation that the expected value of a random variable with
nonnegative values must be nonnegative. Hence

$$E\left[\frac{U - \bar{U}}{\sigma(U)} \pm \frac{W - \bar{W}}{\sigma(W)}\right]^2 \ge 0$$

Expanding, we get

$$2[1 \pm \rho(U,W)] \ge 0$$

which proves that

$$|\rho(U,W)| \le 1$$
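
The definitions (1.11) to (1.13) translate directly into sums over a joint distribution. The following is a minimal sketch under stated assumptions: the joint probabilities are hypothetical, and the constants `a` and `b` are arbitrary choices used only to check $V(aU + b) = a^2 V(U)$.

```python
from fractions import Fraction as F
from math import sqrt

# Hypothetical joint distribution of two 0/1 variables U and W.
p = {(0, 0): F(3, 10), (0, 1): F(1, 10), (1, 0): F(1, 5), (1, 1): F(2, 5)}

def expect(g):
    """E[g(U, W)] over the joint distribution."""
    return sum(prob * g(u, w) for (u, w), prob in p.items())

mean_u = expect(lambda u, w: u)
mean_w = expect(lambda u, w: w)
var_u = expect(lambda u, w: (u - mean_u) ** 2)                 # (1.11)
var_w = expect(lambda u, w: (w - mean_w) ** 2)
cov_uw = expect(lambda u, w: u * w) - mean_u * mean_w          # (1.12)
rho = float(cov_uw) / sqrt(float(var_u) * float(var_w))        # (1.13)

# Variance of the linear transform Z = aU + b (a, b chosen arbitrarily).
a, b = 3, 7
mean_z = a * mean_u + b
var_z = expect(lambda u, w: (a * u + b - mean_z) ** 2)

print(var_u, var_w, cov_uw, rho, var_z)
```

The additive constant `b` drops out of `var_z`, and `rho` stays inside $[-1, +1]$ as the inequality above requires.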


A more general result is contained in Theorem 1.4. 






Theorem 1.4

Let $U$ and $W$ be two random variables. Then

$$E(UW) \le [E(U^2)E(W^2)]^{1/2} \tag{1.14}$$

proof For any real $t$ we have $E(U + tW)^2 \ge 0$. Hence by Theorem
1.1, $f(t) = E(U^2) + t^2 E(W^2) + 2tE(UW) \ge 0$. The function $f(t)$ is a
quadratic in $t$. Since $f(t) \ge 0$ for all $t$, its discriminant cannot be
positive, which proves the theorem.

The following theorem is of the greatest importance in sampling
theory.


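Theorem 1.5

Let $U_i$ ($i = 1, \ldots, n$) be $n$ random variables and $a_i$ ($i = 1, \ldots, n$) be
constants. Then

$$V\left(\sum_i a_i U_i\right) = \sum_i \sum_j a_i a_j \operatorname{Cov}(U_i, U_j) \tag{1.15}$$

proof We have $E\left(\sum_i a_i U_i\right) = \sum_i a_i \bar{U}_i$ by Theorem 1.1. By
definition,

$$\begin{aligned}
V\left(\sum_i a_i U_i\right) &= E\left[\sum_i a_i (U_i - \bar{U}_i)\right]^2 \\
&= E \sum_i \sum_j a_i a_j (U_i - \bar{U}_i)(U_j - \bar{U}_j) \\
&= \sum_i \sum_j a_i a_j E[(U_i - \bar{U}_i)(U_j - \bar{U}_j)] \\
&= \sum_i \sum_j a_i a_j \operatorname{Cov}(U_i, U_j) \qquad \blacksquare
\end{aligned}$$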


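Remark Since $\operatorname{Cov}(U_i,U_i) = V(U_i)$, we can also state that

$$V\left(\sum_i a_i U_i\right) = \sum_i a_i^2 V(U_i) + 2 \sum_i \sum_{j>i} a_i a_j \operatorname{Cov}(U_i, U_j) \tag{1.16}$$

or

$$V\left(\sum_i a_i U_i\right) = \sum_i a_i^2 V(U_i) + 2 \sum_i \sum_{j>i} a_i a_j \rho_{ij}\, \sigma(U_i)\sigma(U_j)$$

where $\rho_{ij}$ is the correlation coefficient of $U_i$ and $U_j$.

Remark If the random variables $U_i$ are mutually uncorrelated, that is,
$\rho_{ij} = 0$, we have

$$V\left(\sum_i a_i U_i\right) = \sum_i a_i^2 V(U_i)$$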
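
As an illustration of (1.15) and (1.16), the variance of a linear function can be computed three ways and compared. This is a sketch under assumptions: the four equally likely outcomes of $(U_1, U_2, U_3)$ and the constants in `a` are hypothetical, picked only so the arithmetic is exact.

```python
from fractions import Fraction as F

# Hypothetical sample space: four equally likely values of (U_1, U_2, U_3).
outcomes = [(1, 2, 0), (2, 0, 1), (0, 1, 3), (3, 3, 2)]
prob = F(1, len(outcomes))
a = [F(2), F(-1), F(1, 2)]          # arbitrary constants a_i
n = len(a)

def expect(g):
    return sum(prob * g(o) for o in outcomes)

def cov(i, j):
    """Cov(U_i, U_j) = E(U_i U_j) - E(U_i) E(U_j)."""
    return expect(lambda o: o[i] * o[j]) - expect(lambda o: o[i]) * expect(lambda o: o[j])

def linear(o):
    return sum(a[i] * o[i] for i in range(n))

# Direct variance of sum a_i U_i.
direct = expect(lambda o: linear(o) ** 2) - expect(linear) ** 2
# Double sum of (1.15).
double_sum = sum(a[i] * a[j] * cov(i, j) for i in range(n) for j in range(n))
# Variances plus twice the covariances with j > i, as in (1.16).
split = (sum(a[i] ** 2 * cov(i, i) for i in range(n))
         + 2 * sum(a[i] * a[j] * cov(i, j) for i in range(n) for j in range(i + 1, n)))

print(direct, double_sum, split)
```

All three quantities coincide, which is the content of Theorem 1.5 and the first Remark.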

A more general result, that is easy to prove, is contained in the following 
theorem. 





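Theorem 1.6

Let there be two linear functions of the sets of random variables
$(U_1, U_2, \ldots, U_m)$, $(W_1, W_2, \ldots, W_n)$, namely,

$$U = \sum_{i=1}^{m} a_i U_i \qquad W = \sum_{j=1}^{n} b_j W_j$$

Then

$$\operatorname{Cov}(U,W) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j \operatorname{Cov}(U_i, W_j) \tag{1.17}$$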


1.5 VARIANCE OF PRODUCTS 

If $X$ and $Y$ are independent random variables, a simple expression can be
found (Goodman, 1960) for $V(XY)$ in terms of $E(X)$, $E(Y)$, $V(X)$, and
$V(Y)$. Let $E(X) = \bar{X}$, $E(Y) = \bar{Y}$, $\delta x = (X - \bar{X})/\bar{X}$, $\delta y = (Y - \bar{Y})/\bar{Y}$.
Then $E(XY) = \bar{X}\bar{Y}$ by Theorem 1.2. Also

$$X = \bar{X}(1 + \delta x) \qquad Y = \bar{Y}(1 + \delta y)$$

Now

$$\begin{aligned}
V(XY) &= E[XY - \bar{X}\bar{Y}]^2 = (\bar{X}\bar{Y})^2 E[\delta x + \delta y + \delta x\,\delta y]^2 \\
&= (\bar{X}\bar{Y})^2 \left[\frac{V(X)}{\bar{X}^2} + \frac{V(Y)}{\bar{Y}^2} + \frac{V(X)V(Y)}{\bar{X}^2\bar{Y}^2}\right]
\end{aligned}$$

Hence we get the important result

$$V(XY) = [E(Y)]^2 V(X) + [E(X)]^2 V(Y) + V(X)V(Y) \tag{1.18}$$

Further, let $Ev(X) = V(X)$, $Ev(Y) = V(Y)$. Then we shall prove that

$$E[X^2 v(Y) + Y^2 v(X) - v(X)v(Y)] = V(XY)$$

The proof follows from the observations that

$$E[X^2 v(Y)] = E(X^2)\,Ev(Y) = [\bar{X}^2 + V(X)]V(Y)$$

$$E[Y^2 v(X)] = [\bar{Y}^2 + V(Y)]V(X)$$

and

$$E[v(X)v(Y)] = V(X)V(Y)$$


Further reading See Exercise 84 for an extension to the situation in 
which the random variables are not necessarily independent. 
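
Result (1.18) can be verified exactly for any pair of independent discrete variables. The sketch below assumes two hypothetical marginal distributions `px` and `py`; the distribution of the product is built by enumerating all pairs, which is valid only because independence is assumed.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical marginal distributions of independent X and Y.
px = {1: F(1, 2), 2: F(1, 3), 5: F(1, 6)}
py = {0: F(1, 4), 3: F(3, 4)}

def mean(dist):
    return sum(p * v for v, p in dist.items())

def var(dist):
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist.items())

# Distribution of the product XY; joint probability is p * q by independence.
pxy = {}
for (x, p), (y, q) in product(px.items(), py.items()):
    pxy[x * y] = pxy.get(x * y, 0) + p * q

exact = var(pxy)
formula = mean(py) ** 2 * var(px) + mean(px) ** 2 * var(py) + var(px) * var(py)

print(exact, formula)
```

The enumerated variance of $XY$ matches the three-term expression of (1.18).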

1.6 CONDITIONAL EXPECTATION

As the subject of this book develops, it will be found that quite often
expectations and variances of random variables have been computed by
using the conditional argument, since it makes the derivation easier. In
this section, two very important theorems involving conditional
expectations will be proved. Survey statisticians have been using these results
for a long time, but they first appeared in print in the book by Hansen,
Hurwitz, and Madow (1953). Let $H_j$ ($j = 1, \ldots, n$) be a set of
mutually exclusive events [$\Pr(H_k H_l) = 0$] of which one necessarily occurs.
Then any event can occur only in conjunction with some $H_j$. Thus the
probability that a random variable $U$ takes the value $u_i$ would be given by

$$\Pr(U = u_i) = \sum_j \Pr(U = u_i, H = H_j) = \sum_j \Pr(H_j)\Pr(U = u_i \mid H_j)$$

Also, the conditional expected value of $U$ given $H_j$ would be

$$E(U \mid H_j) = \sum_i u_i \Pr(U = u_i \mid H_j)$$

For convenience in writing we shall denote $E(U \mid H_j)$ by $E_2(U)$.

Theorem 1.7 

The expected value of a random variable $U$ is given by

$$E(U) = E[E(U \mid H_j)] \tag{1.19}$$

proof We have

$$\begin{aligned}
E(U) &= \sum_i u_i \Pr(U = u_i) \\
&= \sum_i u_i \sum_j \Pr(U = u_i, H = H_j) = \sum_i u_i \sum_j \Pr(U = u_i \mid H_j)\Pr(H_j) \\
&= \sum_j \Pr(H_j) \sum_i u_i \Pr(U = u_i \mid H_j) \\
&= \sum_j \Pr(H_j)E(U \mid H_j) = \sum_j \Pr(H_j)E_2(U) \\
&= E[E_2(U)] \qquad \blacksquare
\end{aligned}$$

Remark Symbolically we may write $E(U) = E_1 E_2(U)$, where $E_2(U)$ is
the conditional expected value of $U$ given $H$, and $E_1$ stands for the
subsequent procedure of taking the expectation (over the space of $H$).

1.6.1 CONDITIONAL VARIANCE AND COVARIANCE 

With the help of Theorem 1.7 it is easy to obtain the covariance of two
random variables in terms of conditional expectations. As before, we
shall denote $E(U \mid H_j)$ by $E_2(U)$. Given $H_j$, the conditional covariance
of $U$ and $W$ is $C_2(U,W) = E_2(UW) - E_2(U)E_2(W)$.




Theorem 1.8 

$$\operatorname{Cov}(U,W) = C(U,W) = E_1 C_2(U,W) + C_1(E_2 U, E_2 W)$$

proof For simplicity, let

$$E_2(U) = x \qquad E_2(W) = y$$

Then

$$\begin{aligned}
\operatorname{Cov}(U,W) &= E(UW) - E(U)E(W) \\
&= E_1 E_2(UW) - E_1(x)E_1(y) \\
&= E_1[E_2(UW) - xy] + E_1(xy) - E_1(x)E_1(y) \\
&= E_1 C_2(U,W) + C_1(E_2 U, E_2 W)
\end{aligned}$$

Corollary

$$\operatorname{Cov}(U,U) = E_1 C_2(U,U) + C_1(E_2 U, E_2 U)$$

or

$$V(U) = E_1 V_2(U) + V_1 E_2(U) \tag{1.20}$$

Thus the variance of a random variable is the sum of the expected value 
of the conditional variance and the variance of the conditional expected 
value. 
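
The decomposition (1.20), together with Theorem 1.7, can be checked on a two-class example. The sketch below is illustrative only: the class probabilities $\Pr(H_j)$ and the conditional distributions of $U$ are hypothetical values.

```python
from fractions import Fraction as F

# Hypothetical setup: Pr(H_j) and the conditional distribution of U given H_j.
classes = {
    "H1": (F(1, 4), {0: F(1, 2), 4: F(1, 2)}),
    "H2": (F(3, 4), {1: F(1, 3), 2: F(2, 3)}),
}

def mean(dist):
    return sum(p * v for v, p in dist.items())

def var(dist):
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist.items())

# Unconditional distribution: Pr(U = u) = sum_j Pr(H_j) Pr(U = u | H_j).
marginal = {}
for p_h, cond in classes.values():
    for u, q in cond.items():
        marginal[u] = marginal.get(u, 0) + p_h * q

# Distribution of the conditional mean E_2(U) over the classes.
cond_means = {}
for p_h, cond in classes.values():
    m = mean(cond)
    cond_means[m] = cond_means.get(m, 0) + p_h

e1_v2 = sum(p_h * var(cond) for p_h, cond in classes.values())  # E_1 V_2(U)
v1_e2 = var(cond_means)                                          # V_1 E_2(U)

print(mean(marginal), mean(cond_means))   # Theorem 1.7
print(var(marginal), e1_v2 + v1_e2)       # equation (1.20)
```

The first printed pair illustrates $E(U) = E_1 E_2(U)$; the second shows the total variance splitting into the expected conditional variance and the variance of the conditional mean.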


1.7 THE TCHEBYCHEFF INEQUALITY 

The interpretation of the variance of a random variable as a measure of
the degree of concentration around the expected value is made clear by the
following inequality due to Tchebycheff. Let $U$ be a random variable
with $E(U) = \bar{U}$. For any $t > 0$ we shall prove that

$$V(U) \ge t^2 \Pr(|U - \bar{U}| \ge t) \tag{1.21}$$

We have

$$V(U) = \sum_i (u_i - \bar{U})^2 p_i = \sum_j (u_j - \bar{U})^2 p_j + \sum_k (u_k - \bar{U})^2 p_k$$

where the first summation runs over all $j$ for which $|u_j - \bar{U}| < t$, and
the second summation runs over all $k$ for which $|u_k - \bar{U}| \ge t$. Now
$V(U) \ge \sum_k (u_k - \bar{U})^2 p_k \ge t^2 \sum_k p_k = t^2 \Pr(|U - \bar{U}| \ge t)$. From this it
follows that

$$\Pr(|U - \bar{U}| \ge t) \le \frac{V(U)}{t^2}$$

Letting $t = \lambda\sigma(U)$, we have

$$\Pr[|U - \bar{U}| \ge \lambda\sigma(U)] \le \frac{1}{\lambda^2}$$




Thus the probability is at most $1/\lambda^2$ that the random variable differs from
its expected value by more than $\lambda$ times its standard deviation.
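
To see how conservative the Tchebycheff bound can be, the exact tail probability may be compared with $V(U)/t^2$ for a small distribution. In this sketch the distribution `dist` and the values of `t` are hypothetical choices made for illustration.

```python
from fractions import Fraction as F

# Hypothetical distribution of U: value -> probability.
dist = {-3: F(1, 10), 0: F(6, 10), 1: F(2, 10), 4: F(1, 10)}

mean_u = sum(p * u for u, p in dist.items())
var_u = sum(p * (u - mean_u) ** 2 for u, p in dist.items())

def tail(t):
    """Exact Pr(|U - mean| >= t)."""
    return sum(p for u, p in dist.items() if abs(u - mean_u) >= t)

# Each row: (t, exact tail probability, Tchebycheff bound V(U)/t^2).
rows = [(t, tail(t), var_u / t ** 2) for t in (1, 2, 3)]
for t, exact, bound in rows:
    print(t, exact, bound)
```

In every row the exact probability is no larger than the bound, as (1.21) guarantees; the bound carries no information when it exceeds 1.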


1.8 AN EXAMPLE 

An example will now be given to illustrate some of the theory presented.
The result obtained will be put to good use in subsequent chapters.
Suppose an urn contains balls of $k$ colors in proportions $p_h$ ($h = 1, \ldots, k$),
$\sum_h p_h = 1$. The random experiment consists in selecting one ball from the
urn. We shall associate a probability of $p_h$ to the outcome that the ball
is of color $h$. Suppose this experiment is repeated $n$ times; care is taken
that the selected ball is always returned to the urn. Let $t_1$ denote the
number of balls of one color (red) and $t_2$ the number of balls of another
color (black) selected in the $n$ repetitions of the experiment. We define
$U_i$ as a random variable which has the value 1 if the $i$th repetition gives a
red ball and 0 otherwise. Similarly $W_i$ assumes the values of 1 or 0
according to whether the $i$th repetition produces or fails to produce a
black ball. Obviously, then

$$t_1 = U_1 + U_2 + \cdots + U_n \qquad t_2 = W_1 + W_2 + \cdots + W_n$$

If $p_1$, $p_2$ are the probabilities of getting a red or black ball respectively at
any selection, we have $E(U_i) = p_1$, $E(W_i) = p_2$. Again, $U_i^2$ has values 1
or 0 with probabilities $p_1$ and $1 - p_1$, respectively. Hence $E(U_i^2) = p_1$,
and similarly $E(W_i^2) = p_2$. Since the variables $U_i$ and $U_j$ are
independent, Theorem 1.2 gives $E(U_i U_j) = p_1^2$ for $i \ne j$. Similarly
$E(W_i W_j) = p_2^2$ for $i \ne j$. And

$$V(U_i) = E(U_i^2) - E^2(U_i) = p_1(1 - p_1)$$

$$V(W_i) = p_2(1 - p_2)$$

$$\operatorname{Cov}(U_i,U_j) = E(U_i U_j) - E(U_i)E(U_j) = 0 \qquad (i \ne j)$$

$$\operatorname{Cov}(W_i,W_j) = 0 \qquad (i \ne j)$$

Since the $i$th selection can produce a ball of only one color, $U_i$ and $W_i$
cannot both take the value of 1 simultaneously, and so $E(U_i W_i) = 0$, which
gives $\operatorname{Cov}(U_i,W_i) = -p_1 p_2$. By the independence of $U_i$ and $W_j$ ($i \ne j$) we get
$E(U_i W_j) = p_1 p_2$, $\operatorname{Cov}(U_i,W_j) = 0$. We are now in a position to
calculate $E(t_1)$, $E(t_2)$, $V(t_1)$, $V(t_2)$, and $\operatorname{Cov}(t_1,t_2)$. By Theorem 1.1, $E(t_1) = np_1$
and $E(t_2) = np_2$. By Theorem 1.5, we have $V(t_1) = np_1(1 - p_1)$ and
$V(t_2) = np_2(1 - p_2)$. Theorem 1.6 gives $\operatorname{Cov}(t_1,t_2) = -np_1 p_2$. These
results may be stated in the form of the following theorem.


Theorem 1.9 

A certain random experiment may produce any of $k$ mutually exclusive events
$A_1, A_2, \ldots, A_k$, the probability of $A_i$ being $p_i > 0$ where $\sum_i p_i = 1$. In a
series of $n$ repetitions of the experiment let $A_i$ occur $t_i$ times. Then

$$E(t_i) = np_i$$

$$V(t_i) = np_i(1 - p_i)$$

$$\operatorname{Cov}(t_i,t_j) = -np_i p_j \qquad (i \ne j)$$
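
For small $n$ the results of Theorem 1.9 can be confirmed by enumerating every possible sequence of draws. The sketch below assumes a hypothetical urn with three colors in proportions 1/2, 1/3, and 1/6 and $n = 3$ draws with replacement; colors are indexed from 0, so index 0 plays the role of red and index 1 of black.

```python
from fractions import Fraction as F
from itertools import product
from math import prod

p = [F(1, 2), F(1, 3), F(1, 6)]   # hypothetical color proportions p_h
n = 3                              # number of draws with replacement

def expect(g):
    """E[g(sequence)] over all k**n color sequences."""
    return sum(prod(p[c] for c in seq) * g(seq)
               for seq in product(range(len(p)), repeat=n))

e_t1 = expect(lambda s: s.count(0))
e_t2 = expect(lambda s: s.count(1))
v_t1 = expect(lambda s: s.count(0) ** 2) - e_t1 ** 2
cov_12 = expect(lambda s: s.count(0) * s.count(1)) - e_t1 * e_t2

print(e_t1, v_t1, cov_12)
```

The enumerated moments agree with $np_1$, $np_1(1 - p_1)$, and $-np_1 p_2$.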


1.9 INTRACLASS CORRELATION COEFFICIENT 

A population consists of $N$ families each having $M$ members. To each
member is attached a value measured on some scale (such as age in years).
The random experiment consists in selecting a family and taking a pair of
members from the family. We assign the same probability to the
selection of every pair. Random variables $U$ and $W$ are defined which take
up values associated with the first member and the second member
forming the pair selected. Obviously, the probability distributions of $U$ and
$W$ are identical. Thus $E(U) = E(W)$ and $V(U) = V(W)$. The
correlation coefficient of $U$ and $W$ is

$$\rho(U,W) = \frac{\operatorname{Cov}(U,W)}{V(U)} = \frac{E[(U - \bar{U})(W - \bar{W})]}{E(U - \bar{U})^2}$$

This is termed the intraclass (intrafamily) correlation coefficient. Let
$y_{ij}$ be the value for the $j$th member of the $i$th family. There will
be $NM(M - 1)$ pairs, each having the same probability. It is then easily
shown that

$$\bar{Y} = E(U) = \frac{\sum_i \sum_j y_{ij}}{NM} \qquad V(U) = \frac{\sum_i \sum_j (y_{ij} - \bar{Y})^2}{NM}$$

$$\operatorname{Cov}(U,W) = \frac{\sum_{i=1}^{N} \sum_{j \ne k} (y_{ij} - \bar{Y})(y_{ik} - \bar{Y})}{NM(M - 1)}$$
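
The intraclass correlation can be computed directly from its definition by listing every ordered pair of members within a family. The sketch below is an illustration under assumptions: the three families of three members (values such as ages) are hypothetical, and ordered pairs are used because $U$ and $W$ refer to the first and second member of the pair.

```python
from fractions import Fraction as F

# Hypothetical population: N = 3 families with M = 3 members each.
families = [[20, 22, 25], [40, 41, 45], [30, 33, 36]]
N, M = len(families), len(families[0])

# All ordered within-family pairs (U, W); each is equally likely.
pairs = [(fam[j], fam[k]) for fam in families
         for j in range(M) for k in range(M) if j != k]
pr = F(1, len(pairs))

ybar = sum(pr * u for u, _ in pairs)                       # E(U)
var_u = sum(pr * (u - ybar) ** 2 for u, _ in pairs)        # V(U)
cov_uw = sum(pr * (u - ybar) * (w - ybar) for u, w in pairs)
rho = cov_uw / var_u                                       # intraclass correlation

# Population mean and variance over all NM members, for comparison.
values = [y for fam in families for y in fam]
ybar_all = F(sum(values), N * M)
var_all = sum((y - ybar_all) ** 2 for y in values) / (N * M)

print(len(pairs), ybar, var_u, float(rho))
```

Because every member appears as the first element of exactly $M - 1$ pairs, $E(U)$ and $V(U)$ coincide with the mean and variance taken over all $NM$ members.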


1.10 THE BEST WEIGHT FUNCTION

Another problem whose solution will be needed later is the following.
There are a number of random variables $U_i$ ($i = 1, \ldots, n$) with the same
expected value $E(U_i) = a$ and $\operatorname{Cov}(U_i,U_j) = a_{ij}$. We want to find
weights $w_i$ ($i = 1, \ldots, n$), $\sum_i w_i = 1$, such that $V\left(\sum_i w_i U_i\right)$ is
a minimum. By Theorem 1.5,
$V\left(\sum_i w_i U_i\right) = \sum_i \sum_j w_i w_j a_{ij} = wAw'$, where $A = (a_{ij})$, $w$ is the
row vector $(w_1, \ldots, w_n)$, and $w'$ is the transpose of $w$. Consider now
Cauchy's inequality

$$(xy')^2 \le (xMx')(yM^{-1}y') \tag{1.22}$$

where $M$ is symmetric positive definite. The sign of equality holds if
and only if $xM = \lambda y$ where $\lambda \ne 0$ is a scalar. In this inequality we
substitute $x = w$, $y = e$, the vector $e = (1, 1, \ldots, 1)$, and $M = A = (a_{ij})$.
We then have

$$(we')^2 \le (wAw')(eA^{-1}e')$$

But $we' = \sum_i w_i = 1$. Hence $wAw' \ge 1/(eA^{-1}e')$, there being equality
if $wA = \lambda e$ or $w = \lambda eA^{-1}$ or $we' = \lambda eA^{-1}e'$. But $we' = 1$. Hence
$\lambda = 1/(eA^{-1}e')$ and so the best weight function is given by

$$w = \frac{eA^{-1}}{eA^{-1}e'} \tag{1.23}$$

and the minimum variance is $1/(eA^{-1}e')$. Now, $eA^{-1}e'$ = sum of all the
elements in $A^{-1}$, the matrix inverse to $A$, and the $i$th component of
$eA^{-1}$ = sum of the elements in the $i$th column of $A^{-1}$. Thus the best
weights are calculable from the elements of the inverse matrix $A^{-1}$.
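
The recipe in (1.23), column sums of $A^{-1}$ divided by the sum of all its elements, is short enough to carry out by hand or in a few lines of code. The sketch below is illustrative: the covariance matrix `A` is a hypothetical symmetric positive definite matrix, and `inverse` is a small Gauss-Jordan helper written for this example rather than a library routine.

```python
from fractions import Fraction as F

# Hypothetical covariance matrix A = (a_ij), symmetric positive definite.
A = [[4, 1, 0],
     [1, 2, 1],
     [0, 1, 3]]
n = len(A)

def inverse(a):
    """Gauss-Jordan inverse of a nonsingular matrix, in exact fractions."""
    size = len(a)
    aug = [[F(x) for x in row] + [F(int(i == j)) for j in range(size)]
           for i, row in enumerate(a)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

def variance(w):
    """wAw' = sum_i sum_j w_i w_j a_ij."""
    return sum(w[i] * w[j] * A[i][j] for i in range(n) for j in range(n))

A_inv = inverse(A)
col_sums = [sum(A_inv[r][c] for r in range(n)) for c in range(n)]  # e A^{-1}
total = sum(col_sums)                                              # e A^{-1} e'
best_w = [s / total for s in col_sums]                             # (1.23)
min_var = variance(best_w)
equal_w = [F(1, n)] * n

print(best_w, min_var, variance(equal_w))
```

The best weights sum to 1, their variance equals $1/(eA^{-1}e')$, and the equally weighted mean has a variance at least as large.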


1.11 MINIMUM-VARIANCE UNBIASED ESTIMATION 

The following theorem will be used in the sequel to find minimum-variance
unbiased (MVU) estimators. An estimator $T$ is said to be unbiased for
estimating the parameter $\theta$ if $E(T) = \theta$. And if $V(T) \le V(T')$, where
$T'$ is any other unbiased estimator of $\theta$, we shall call $T$ a MVU estimator.

Theorem 1.10

A necessary and sufficient condition for $T_0$ to be the MVU estimator of a
parameter is that $\operatorname{Cov}(T_0,z) = 0$ for all $z$, where $z$ is a zero function, i.e., a
function with expectation identically zero.

proof Let $T_0$ be MVU for the parameter. Then $V(T_0) \le V(T_0 + \epsilon z)$
for all $\epsilon$. Thus

$$V(T_0) \le V(T_0) + \epsilon^2 V(z) + 2\epsilon \operatorname{Cov}(T_0,z)$$

or

$$\epsilon[2\operatorname{Cov}(T_0,z) + \epsilon V(z)] \ge 0$$

Assume $\epsilon > 0$ and let $\epsilon \to 0$ through positive values. Then $\operatorname{Cov}(T_0,z) \ge 0$.
Again, assume $\epsilon < 0$ and let $\epsilon \to 0$ through negative values. Then
$\operatorname{Cov}(T_0,z) \le 0$. Hence $\operatorname{Cov}(T_0,z) = 0$. To prove that the condition is
sufficient, let $T_1$ be any other unbiased estimator of the parameter. Then
$T_1 = T_0 + (T_1 - T_0)$. Now

$$V(T_1) = V(T_0) + V(T_1 - T_0) + 2\operatorname{Cov}(T_0, T_1 - T_0)$$




But $\operatorname{Cov}(T_0, T_1 - T_0) = 0$, since $T_1 - T_0$ is a zero function. Hence
$V(T_1) \ge V(T_0)$, which proves that $T_0$ is MVU (Rao, 1952). ■

Corollary

If $T_0$ is MVU, $\operatorname{Cov}(T_1 - T_0, T_0) = 0$ where $T_1$ is any unbiased estimator
of the parameter. Thus $\operatorname{Cov}(T_1,T_0) = \operatorname{Cov}(T_0,T_0) = V(T_0)$. This
shows that the variance of a MVU estimator $T_0$ can be found by computing
the covariance of $T_0$ and any other unbiased estimator of the parameter.


1.12 TWO LIMIT THEOREMS 

We will conclude this chapter by stating two limit theorems in probability 
(Feller, 1950) which reveal a new aspect of the notion of the expectation of 
a random variable. 

Law of large numbers Let $\{X_k\}$ be a sequence of mutually
independent random variables with a common distribution. If the expectation
$\bar{X} = E(X_k)$ exists, then for every $\epsilon > 0$ as $n \to \infty$

$$\Pr\left(\left|\frac{X_1 + \cdots + X_n}{n} - \bar{X}\right| > \epsilon\right) \to 0 \tag{1.24}$$

Thus the probability that the average $\sum X_i/n$ will differ from the
expectation by less than an arbitrarily prescribed $\epsilon$ tends to one.

Central-limit theorem Let $\{X_k\}$ be a sequence of mutually
independent random variables with a common distribution. Suppose that
$\bar{X} = E(X_k)$ and $\sigma^2 = V(X_k)$ exist. Then for every fixed $a$, $b$ ($a < b$), as $n \to \infty$

$$\Pr\left(a < \frac{\sum X_i - n\bar{X}}{\sigma\sqrt{n}} < b\right) \to \Phi(b) - \Phi(a)$$

where $\Phi(x)$ is the probability that the standardized normal variable does
not exceed $x$.
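
Both limit theorems can be illustrated without simulation when the $X_k$ are 0/1 variables with probability 1/2 each, since the sum is then binomial and its probabilities are exact. The sketch below rests on that assumption; the choices $\epsilon = 1/10$, $a = -1$, $b = 1$, and the values of $n$ are arbitrary.

```python
from fractions import Fraction as F
from math import comb, erf, sqrt

def tail_prob(n, eps):
    """Exact Pr(|S_n/n - 1/2| > eps) for n fair 0/1 trials."""
    hits = sum(comb(n, k) for k in range(n + 1)
               if abs(F(k, n) - F(1, 2)) > eps)
    return hits / 2 ** n

def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def clt_prob(n, a, b):
    """Exact Pr(a < (S_n - n/2) / (sqrt(n)/2) < b) for n fair 0/1 trials."""
    sd = sqrt(n) / 2
    hits = sum(comb(n, k) for k in range(n + 1) if a < (k - n / 2) / sd < b)
    return hits / 2 ** n

# Law of large numbers: the tail probability shrinks as n grows.
tails = [tail_prob(n, F(1, 10)) for n in (10, 100, 1000)]

# Central-limit theorem: exact probability against the normal limit.
exact = clt_prob(1000, -1.0, 1.0)
normal = phi(1.0) - phi(-1.0)

print(tails)
print(exact, normal)
```

The tail probabilities decrease toward zero, and the exact probability for $n = 1000$ lies close to $\Phi(1) - \Phi(-1)$; the remaining gap reflects the discreteness of the binomial sum.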

REFERENCES 

Feller, W. G. (1950). “An introduction to probability theory and its
applications.” John Wiley & Sons, Inc., New York.

Goodman, L. A. (1960). On the exact variance of products. J. Am. Stat.
Assoc., 55.

Hansen, M. H., W. N. Hurwitz, and W. G. Madow (1953). “Sample survey
methods and theory,” vol. 2. John Wiley & Sons, Inc., New York.

Rao, C. R. (1952). Some theorems on minimum variance unbiased estimation.
Sankhya, 12.




CHAPTER TWO 


SAMPLE SURVEY BACKGROUND 


2.1 INTRODUCTION 

The major aim of this book is to present sample survey theory. But 
theory by itself cannot be fully appreciated without an adequate knowledge 
of the type of problems in which it is to be used. It is therefore a matter 
of considerable importance that the reader have a good idea of the work a 
sampling statistician is called upon to do and the limitations under which 
he works. The purpose of the present chapter is to do just that. 


2.2 THE MAIN PROBLEM

In its broadest sense the purpose of a sample survey is the collection of 
information to satisfy a definite need. The need to collect data arises in 
every conceivable sphere of human activity. Only a few examples will
be given from selected fields. 

Population Most governments nowadays collect information
regularly about: the total population (number of persons); its distribution by
area, sex, age, and other socioeconomic characteristics; the rate of growth
of the population; internal migration, and so on. These data help in
determining the future needs for such items as food, clothing, housing,
education, and recreational facilities. Data on internal migration are used 
for assessing the social and economic problems when there are major 
shifts from the rural to the urban areas, for example. Broadly speaking, 
data on the nature and size of the population can be used for determining 
the demand for goods and services and the size and quality of the labor 
resources needed to produce these goods and services. 

Labor Since labor is a key resource in production, data are collected 
on the number of persons engaged in economic activity, the number of 
hours they work, and the average output per man-hour of work. The 
wages and salaries paid to labor determine living levels and the demand for 
goods and services. Data about the distribution of the labor force by 
branch of economic activity give a useful indication of the structure of 
production in the country. Classifications of the economically active by 
occupation can be used to study the capabilities of the labor force from 
the point of view of development projects. Detailed information on the 
unemployed persons is used to find out what type of work they are looking 
for, how long they have been in search of work, what type of training they 
have had, and other pertinent facts. 

Agriculture With rising populations, it is becoming more and more 
important to assess the agricultural resources of the country. The pro¬ 
portion of land under agriculture, areas under different crops, areas under 
pastures and forests, production of food—grains, fruits, etc.—and the 
number and quality of livestock are some of the items of information 
essential to any planned program of national development, especially in 
underdeveloped countries. Data on the number and area of farm holdings 
by size and type of tenure can be used to determine the extent to which 
these factors may be contributing to agricultural productivity as well as 
to devise remedies. 

Industry Collection of information in the industrial sector is no less 
important. The number of industrial undertakings and their kind, the 
number of persons engaged in them, the amount of raw materials con¬ 
sumed, the extent of production of goods and the value added by their 
manufacture are some of the data needed. Data on the capacity of power 
equipment installed in an industrial undertaking can be used to measure 
the extent of mechanization and to decide where efforts to increase capital
equipment should be concentrated. Indices of industrial production,
when regularly calculated, point to the success or failure in increasing
production.

Internal trade Commerce and related services form an important
part of economic activity. Information is required on the role and
character of the wholesale, retail, and service trades. The number of
establishments engaged in each trade, by kind of business, the value of sales
of retail stores, and the value of inventories are some of the items on 
which information is needed to assess business conditions. Figures on 
sales of goods and services at retail can be used as indicators of the level of 
personal consumption. 


2.2.1 CHARACTERISTICS OF INTEREST 

The foregoing discussion suggests that there is a variety of purposes for 
which information is collected. Most frequently, however, interest has 
centered on four characteristics of the universe or population under study. 
These are: population total (e.g., the total number unemployed),
population mean (the average number of persons engaged by an industrial
establishment), population proportion (proportion of cultivated area
devoted to cotton), and population ratio (the ratio of expenditure on foods
to that on rent). The populations considered are finite in the sense that
the number of objects contained in them (such as persons, farms, firms,
stores) is limited.
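
The four characteristics can be made concrete on a toy finite population. The sketch below is purely illustrative: the four households and their figures are hypothetical, as are the field names.

```python
# Hypothetical finite population of four households.
households = [
    {"persons": 4, "unemployed": 1, "food": 60, "rent": 30},
    {"persons": 2, "unemployed": 0, "food": 35, "rent": 25},
    {"persons": 5, "unemployed": 2, "food": 80, "rent": 20},
    {"persons": 3, "unemployed": 0, "food": 45, "rent": 25},
]
size = len(households)

# Population total: total number unemployed.
total_unemployed = sum(h["unemployed"] for h in households)

# Population mean: average number of persons per household.
mean_persons = sum(h["persons"] for h in households) / size

# Population proportion: share of households with at least one unemployed member.
prop_with_unemployed = sum(1 for h in households if h["unemployed"] > 0) / size

# Population ratio: total expenditure on food relative to that on rent.
ratio_food_rent = sum(h["food"] for h in households) / sum(h["rent"] for h in households)

print(total_unemployed, mean_persons, prop_with_unemployed, ratio_food_rent)
```

Note that the ratio is a ratio of two totals, not the average of the household-level ratios; the two generally differ.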


2.3 SAMPLE VERSUS COMPLETE ENUMERATION 

Broadly speaking, information on a population may be collected in two 
ways. Either every unit in the population is enumerated (called complete 
enumeration, or census) or enumeration is limited to only a part or a 
sample selected from the population (called sample enumeration or sample
survey). A sample survey will usually be less costly than a complete
census because the expense of covering all units would be greater than that
of covering only a sample fraction. Also, it will take less time to collect
and process data from a sample than from a census. But economy is not
the only consideration; the most important point is whether the accuracy
of the results would be adequate for the end in view. It is a curious fact
that the results from a carefully planned and well-executed sample survey
are expected to be more accurate (nearer to the aim of study) than those
from a complete census that can be taken. A complete census ordinarily
requires a huge and unwieldy organization and therefore many types of
errors creep in which cannot be controlled adequately. In a sample survey
the volume of work is reduced considerably, and it becomes possible to
employ persons of higher caliber, train them suitably, and supervise their
work adequately. In a properly designed sample survey it is also possible
to make a valid estimate of the margin of error and hence decide whether
the results are sufficiently accurate. A complete census does not reveal
by itself the margin of uncertainty to which it is subject. But there is
not always a choice of one versus the other. For example, if data are 
required for every small administrative area in a country, no sample 
survey of a reasonable size will be able to deliver the desired information; 
only a complete census can do this. 


2.4 THE ROLE OF THE SAMPLING METHOD 

The sample survey has now come to be considered an organized fact-finding
instrument. Its importance to modern civilization lies in the fact that it 
can be used to summarize, for the guidance of administration, facts which 
would otherwise be inaccessible owing to the remoteness and obscurity of 
the persons or other units concerned, or their numerousness. Sampling 
surveys allow decisions to be made which take into account the significant 
factors of the problems they are meant to solve. As a fact-finding agency 
a sample survey is not primarily concerned with the sociological or eco¬ 
nomic interpretation of the facts ascertained, although it should supply 
material adequate for such interpretation. Rather it is concerned with 
the accurate ascertainment of the individual facts recorded and with their 
compilation and summarization. How a sample survey is to be organized 
at different levels will depend upon the type of questions it is required to 
answer. There are, however, certain ingredients which are common to 
most large-scale surveys. We shall deal with some of these in the next 
section, providing occasional illustrations from the Greek Household 
Survey (GHS) of urban areas (National Statistical Service of Greece, 
1963). 


2.5 PLANNING AND EXECUTION OF SAMPLE SURVEYS 

The following are some of the main steps involved in the planning and 
execution of large-scale sample surveys. 

Objectives The first task is to lay down in concrete terms the
objectives of the survey. It is generally found that the sponsoring agency itself
does not know precisely what it wants and how it is going to use the
results. The statistician's job is to hold discussions with the sponsors in
order to make them start thinking in concrete terms. Failure to clarify
the purpose of the survey will undermine its ultimate value; in the end it 
may be found that the results are not what was really wanted. (In the 
GHS the main purpose was to collect data on the pattern of expenditure of 
urban households to provide a reliable basis for the weighting of the 
planned index of consumer prices.) 




Population to be covered The objectives of the survey should define
the population the survey is intended to cover. But practical difficulties
in handling certain segments of the population may point to their
elimination from the scope of the survey. For example, in a population survey
it may be found extremely difficult to cover the transient population.
In an agricultural inquiry in which the intention is to take in every small
piece of land to determine what it grows, practical considerations may
force the omission of such places as kitchen gardens. In an industrial
survey all plants employing less than two persons may have to be omitted
if it is found that it would be extremely difficult to include them. Thus
the target population would generally be different from the population
actually sampled. The results obtained will apply to the population
actually sampled. Sometimes, information is collected in a different
manner from the omitted sector. This is done through procedures which
are not entirely rigorous but which can throw some light on the subject
matter of the survey. The users of the data are provided with figures
relating to both parts of the universe, along with a description of the
limitations under which they were collected. (The GHS had to be
limited to urban areas only, and all institutional households, such as
hotels, boarding houses, and hospitals, were excluded.)

The frame In order to cover the population decided upon, there 
should be some list, map, or other acceptable material (called the frame) 
which serves as a guide to the universe to be covered. The list or map 
must be examined to be sure that it is reasonably free from defects. If it
is out of date, consideration should be given to making it up to date. It
would be important to know how the list or the map had been made.
(In the GHS the combined list of residential dwellings reported in the 
previous census and of residential dwellings erected since the census was 
used as the frame.)

Sampling unit For purposes of sample selection, the population
should be capable of being divided up into what may be called sampling
units. For example, a human population can be considered to be built
up of villages, census enumeration districts, households, or persons. The
important point is that the division of the population into sampling units
should be unambiguous. Every element of the population should belong
to just one sampling unit. If, for example, the unit is a household, it
should be so defined that a person does not belong to two different
households nor should it leave out any persons belonging to the population.
This is not an easy task, since borderline cases will always arise, and some
arbitrary rules will have to be framed to handle these cases. (In the
GHS, a private household was defined as a person living on his own in a
dwelling, or a group of persons permanently sharing the same dwelling
and having common arrangements for the provision of at least one principal
meal a day. Persons temporarily absent for less than 6 months were
included as members of the household, while temporary visitors sharing
for less than 6 months were excluded.)

Sample selection At this stage the question of the size of the sample,
the manner of selecting the sample, and the estimation of population
characteristics along with their margin of uncertainty are some of the
technical problems that should receive the most careful attention. These
questions form the contents of sampling theory, with which this book is
primarily concerned.

The information to be collected The question of the kind of
information to be collected should be considered at an early stage of planning the
survey. Only data relevant to the purposes of the survey should be
collected. If there are too many questions, the respondents begin to lose
interest in answering them. On the other hand, it must be ensured that
no important item is missing. A practical procedure is to prepare outlines
of the tables that the survey should produce; this will eliminate irrelevant
information and ensure that all essential items find a place. A major
consideration would be the practicability of obtaining the information
sought. Respondents may not be sufficiently informed to be capable of
giving the right answers.

Method of collecting information The method of collecting the
information (whether by mail or by interview or otherwise) has to be decided,
keeping in view the costs involved and the accuracy aimed at. Usually,
one would prefer physical observation (if possible) to asking questions,
interviewing respondents to mailing out questionnaires, etc. Mail surveys
cost less, but there may be considerable nonresponse. Interviewers cost
more and there are interviewer errors, but without interviewers the data
collected may be worthless. The problem is a complex one, and a solution
taking into account conditions pertaining to the particular survey must
be found. In the annual industrial surveys in Greece, for example,
questionnaires are mailed out to establishments, and nonrespondents are
interviewed by a special staff recruited for this purpose. (In the Greek
Household Survey there was no real choice in the method of collecting
information. Interviewers who would make daily visits to the sample
households in order to record expenditure incurred by all members had
to be used.)

Time reference and reference period A decision has to be made
concerning the time reference (the period to which the results of the survey
will relate) and the reference period (the period for which information is
collected from sample units). For example, in the GHS the time reference
was 1 year, but the reference period for most items was 1 week (each
household was required to provide information for just 1 week). The
sample was staggered over time (about one-twelfth of the total number of
sample households was interviewed each month). The problem of the
choice of the reference period is important. A shorter reference period
may give more accurate data, but a larger sample is necessary with this
method, and this means increased costs. A longer reference period may
be cheaper, but the information collected may not be so accurate, owing
to memory failure, etc.

The questionnaire or schedule The questionnaire (to be filled in by
the respondent) or the schedule (to be completed by the interviewer) forms
a very important part of the sample survey. Having decided upon the
data to be collected, the problem of their presentation requires considerable
skill. The questions should be clear, unambiguous, and to the point.
Vague questions do not bring forth clear answers. Leading questions
should be avoided. Since response may depend to some extent on the
order in which questions are asked, the order of questions is another
matter to be considered. A small pretest always helps to decide upon an
effective method of asking the questions. All technical terms used should
be properly defined.

Training of interviewers and their supervision The success of a survey
using the interview method depends largely on the ability of the
interviewers to elicit acceptable responses. Their selection and training is very
important. Detailed instructions should be outlined for the proper
training of interviewers in the methods of measurement. Observation by a
supervisor during the course of an actual interview is valuable for
maintaining standards and for studying the interviewer’s adherence to procedures
and tact in answering questions raised by respondents.
Inspection of returns An initial quality check should be instituted
while the interviewers are in the field to supply missing entries and correct
apparent inconsistencies. Later, the clerical staff should make a careful
review of the questionnaires received.

Nonrespondents Procedures will have to be devised to deal with
those who do not give information. The reason for nonresponse, as well
as any other information which can be conveniently obtained from the
sample unit, should be recorded. (In the GHS some basic data, such as
size of household, were collected from households which refused to
cooperate. This helped in assessing the effect of refusals on the characteristics
of participating households.)

(h) Analysis of data  When the data are transferred to mechanical
equipment for analysis, the errors involved should be kept under control.
Since the machines, too, can make mistakes, machine work should be
checked. And finally a report must be written that gives the findings of
the survey concerning those questions it was meant to answer.





2.6 JUDGMENT SAMPLING 

For the collection of information on a sampling basis, certain procedures 
are rejected outright by the survey statistician. This happens when it is
not possible to find an objective method of distinguishing one procedure 
from another. To give an example, information could be collected inex¬ 
pensively by asking persons known as experts in the subject. These 
experts would no doubt differ from one another, and there is no objective 
method by which to distinguish the opinion of one expert from that of 
another. Another procedure belonging to this category consists of limiting 
the sample to units that appear to be representative of the population 
under consideration.* Information is collected on these units, and from 
these estimates of population characteristics are made. Here again the 
judgment of the person selecting the sample is significant, for different 
persons will judge differently. There is no objective method of preferring
one judgment to another. We cannot predict the type or the distribution 
of the results produced by a large number of judgment samplers, nor can 
we predict the manner in which these will differ from the so-called “true” 
value aimed for. We do not know any objective method of measuring 
the confidence to be placed in the results obtained when the sample is 
selected by judgment. The reason is that with such methods the proba¬ 
bility that a given unit will be selected into the sample is unknown. We 
are therefore unable to determine the frequency distribution of the esti¬ 
mates this procedure (of judgment sampling) will produce. In the absence 
of information on the manner in which different samples will differ from 
each other, the sampling error cannot be objectively determined. 


2.7 PROBABILITY SAMPLING 

The picture completely changes as soon as we begin using a sampling 
procedure in which every unit belonging to the population has a known
and nonzero probability of being selected in the sample. With the help of
probability theory we are then in a position to determine the frequency
distribution of the estimates derivable from the sampling and estimation
procedure. We can calculate the proportion of estimates that will fall 
in a specified interval around the so-called “true” value aimed for. We 
know what results the repeated use of a specific sampling procedure will 
produce, which enables us to distinguish one procedure from another. 
And, what is very important, a measure of the sampling variation (the
manner in which sample estimates will differ from the average) can be
obtained objectively from the sample itself. The entire apparatus of
probability theory and statistical inference (based on that) is available
for drawing valid conclusions from the sample. Only procedures of
probability sampling such as these will be considered henceforth in the book.

* An example is quota sampling, in which interviewers are free to choose their
sample provided that the sample shall contain so many men and so many women,
so many persons of high income and so many of low income, and so on.


2.8 FORMATION OF ESTIMATORS 

In the general estimation theory appropriate to infinite populations, linear 
estimators are of the form 

$$c_1 y_1 + c_2 y_2 + \cdots + c_n y_n \tag{2.1}$$

The quantities $c_1, c_2, \ldots, c_n$ are constants attached to the observations
made on the 1st, 2d, ..., $n$th selections (in the sample), respectively.
In sampling theory dealing with finite populations in which the units are 
identifiable, linear estimators of a more general type can be considered. 
The coefficient to be attached to any selection may depend on the unit 
selected, the sample that produces this unit, and the order in which the 
unit appears in the sample. One may imagine the totality of ordered 
samples of size n from a population of size N and a coefficient attached to 
every unit in the population, depending upon the sample to which it 
belongs, and its position in the sample. The estimator (2.1) would be a 
particular case in which every sample has the same coefficient set attached 
to it. 


2.9 UNBIASED ESTIMATION 

Although any function of the observations could be used as an estimator 
of a population characteristic, only functions with some desirable proper¬ 
ties will be considered. The trend in sample survey theory is to use 
unbiased estimators as much as possible. With such estimators the 
expected value is the same as that which could be obtained from a complete 
count or census using identical methods of measurements. That it is a 
useful requirement to place on estimators follows from a well-known 
theorem in probability (see Sec. 1.12) which states that in a long series of 
repetitions of the sampling procedure, the average of the different values 
assumed by the estimator will be close to its expected value. Working 
with finite populations as we do, the criterion of consistency will ordinarily 
mean that the sample estimate equals the census count when the sample 
size equals the population size. Another reason for preferring unbiased 
estimators is that in repetitive surveys where estimates are made regularly
(say, every month), there is the problem of combining these estimates for
getting, say, annual figures. If the estimation procedure is biased, the
bias accumulates faster than the sampling error, which can be a
disadvantage of the estimation process.


2.10 PRECISION OF ESTIMATORS 

The precision, or a measure of the closeness of the sample estimates to the
census count taken under the same conditions, is judged in sampling theory
by the variance of the estimators concerned. Here reliance is placed on
the fact that the smaller the variance, the greater the concentration of
the estimates (in terms of probability) around the value aimed for. With
unbiased estimators the variance is thus used for judging the precision.
If the distribution of the estimator is normal, there is a direct
relationship between the variance and the degree of concentration. It is
a part of good sampling practice to use procedures for which the
distribution of the estimator approaches normality. Here reliance is placed
on the central limit theorem (Sec. 1.12) in probability, by which the
distribution of a sum approaches normality as the number of random
variables gets large. In sampling theory the estimators generally are of
the form of sums.


2.11 BIASED ESTIMATORS 

We have no intention of suggesting that biased estimators should not be
used at all. In Fig. 2.1, the degree of concentration of the sample
estimates around the value aimed at ($c_0$, the true value) is greater for
the distribution B than for the distribution U, although B does not center
at $c_0$ and U does. The probability that the sample estimates fall in an
interval $(a, b)$ is larger in the case of B than with U. In such a
situation the biased estimator is preferable to the unbiased one. As a
matter of fact, biased estimators are extensively used in sampling practice
when the object is to estimate [...]. This raises two questions: (1) What
criterion should be used to distinguish one biased estimator from another?
(2) Do confidence statements based on the variance remain valid when the
distribution does not center at $c_0$? When the estimator is biased, its
distribution is centered at a value which is not the same as $c_0$. It will
be natural in such a situation to take deviations from $c_0$ itself and
calculate the expected value of their squares. The quantity so obtained is
called the mean square error (MSE). It is obvious that for any estimator
$t$, the mean square error around the value $\mu$ would be



$$\begin{aligned}
\mathrm{MSE}(t) = E(t - \mu)^2 &= E[t - E(t)]^2 + [E(t) - \mu]^2 \\
&= V(t) + [B(t)]^2 \\
&= V(t)\left\{1 + \left[\frac{B(t)}{\sigma(t)}\right]^2\right\}
\end{aligned} \tag{2.2}$$


where $V(t) = \sigma^2(t)$ and $B(t)$ stands for the bias associated with $t$. If $t$ is
unbiased, the variance and the mean square error would coincide. Of
two estimators $t_1$ and $t_2$, the one giving the smaller mean square error
around the parameter to be estimated will be preferred. Another way of
looking at it is to assume that the loss involved in using $t$ instead of $\mu$
is given by $(t - \mu)^2$. The average loss then is $E(t - \mu)^2$, the mean square
error around $\mu$. An estimator giving a smaller average loss is preferred
to another with which the average loss is higher.
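
The decomposition (2.2) can be checked numerically. The following is only a sketch under stated assumptions: the five population values and the shrunken estimator (0.9 times the sample mean) are invented for illustration and do not come from the text. Every sample of two distinct units is enumerated, so the expectations are exact.

```python
from fractions import Fraction
from itertools import combinations

def mse_parts(population, n, estimator):
    """Return (MSE, variance, bias) of `estimator` for the population mean,
    averaging over all equally likely samples of n distinct units."""
    mu = Fraction(sum(population), len(population))
    values = [estimator(s) for s in combinations(population, n)]
    count = len(values)
    expected = sum(values, Fraction(0)) / count
    variance = sum((t - expected) ** 2 for t in values) / count
    mse = sum((t - mu) ** 2 for t in values) / count
    return mse, variance, expected - mu

# Hypothetical population and a deliberately biased (shrunken) estimator.
population = [1, 3, 4, 8, 9]
shrunk_mean = lambda s: Fraction(9, 10) * Fraction(sum(s), len(s))

mse, variance, bias = mse_parts(population, 2, shrunk_mean)
print(mse, variance, bias, mse == variance + bias ** 2)
```

Exact fractions are used so that the identity MSE = variance + bias squared holds without rounding error.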


2.11.1 CONFIDENCE INTERVALS 


In order to estimate $\mu$, suppose a biased estimator $W$ is used with $E(W) = m$,
so that $B(W) = m - \mu$ is the amount of bias present in $W$ for estimating
$\mu$. Assuming the distribution of $W$ to be normal, the question raised
in Sec. 2.11 is: What is the status of the confidence interval $\mathcal{I}$:
$[W \pm 1.96\sigma(W)]$ based on the variance of $W$ although $W$ is subject to a
bias of $B(W)$? The answer lies in computing the exact probability $P_0$ that
the interval $\mathcal{I}$ covers $\mu$. We have

$$\begin{aligned}
P_0 &= \Pr[\mu - 1.96\sigma(W) < W < \mu + 1.96\sigma(W)] \\
&= \Pr\left[-\frac{B(W)}{\sigma(W)} - 1.96 < \frac{W - m}{\sigma(W)} < -\frac{B(W)}{\sigma(W)} + 1.96\right]
\end{aligned} \tag{2.3}$$


But $(W - m)/\sigma(W)$ is a normal variate with mean zero and variance
unity. Hence $P_0$ can be calculated for each value of $B(W)/\sigma(W)$ using
the tables of the normal distribution. A few calculations are presented
in Table 2.1.

Table 2.1  Probability of $\mathcal{I}$ covering $\mu$

  $|B(W)|/\sigma(W)$    $P_0$       $|B(W)|/\sigma(W)$    $P_0$
        0.00          0.9500            0.10          0.9489
        0.01          0.9500            0.30          0.9396
        0.03          0.9499            0.50          0.9210
        0.05          0.9497            0.70          0.8923
        0.07          0.9494            0.90          0.8533
        0.09          0.9491            1.00          0.8300

When $W$ is unbiased, this probability is 0.95. As the ratio
of bias to the standard deviation increases, this probability decreases.
Provided $|B(W)|/\sigma(W)$ remains less than 0.1, this probability does not
differ appreciably from 0.95. Hence, if we can prove that the bias
associated with an estimator as a fraction of its standard deviation is smaller
than 0.1, confidence statements can be made as if no bias were present.
In case the bias $B(W)$ is known, the exact 95 percent confidence interval
based on $\sigma(W)$ would be

$$W - B(W) \pm 1.96\sigma(W)$$
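
The entries of Table 2.1 follow from Eq. (2.3) and can be recomputed without printed normal tables. The sketch below is illustrative; the function names `normal_cdf` and `coverage` are assumptions made here, and the normal distribution function is obtained from the standard library's error function.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def coverage(bias_ratio, z=1.96):
    """P0 of Eq. (2.3): chance that W +/- z*sigma(W) covers mu
    when B(W)/sigma(W) equals bias_ratio."""
    return normal_cdf(z - bias_ratio) - normal_cdf(-z - bias_ratio)

# Reproduce the right-hand column of Table 2.1 (and the unbiased case).
for r in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"{r:4.2f}  {coverage(r):.4f}")
```

Because the normal curve is symmetric, a bias of either sign gives the same coverage, which is why the table is indexed by the absolute value of the ratio.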


2.12 THE QUESTION OF COST 

Sampling statisticians, dealing with problems in the real world as they do, 
take a very practical attitude in the selection of their procedures. Since 
every operation means cost, an attempt is made to use simple, straightforward
procedures, procedures which can be completed within the time
schedules, and which take into account all administrative requirements. 
Modern sample surveys are becoming multipurpose in character in the 
sense that information is collected on hundreds of items belonging to 
different fields of enquiry and that the results must be made available 
before they become out of date. As a result, it is not practicable to use 
many of the refined results in the general estimation theory. There is no 
time to examine the distribution followed by every item in the survey and 
to calculate, as an example, maximum-likelihood estimates using the frequency
distributions ascertained. The sum and the sum of squares of
the observations in the sample are the only quantities which could possibly 
be calculated in large-scale surveys. An estimator with a larger variance, 
but which is cheaper to handle, may be preferred to another which requires 
heavier computations, although its variance is smaller. 






2.13 THE FUNDAMENTAL PRINCIPLE OF SAMPLE DESIGN 

With every sampling and estimation procedure is associated the cost of
the survey and the precision (measured, say, in terms of the mean square
error) of the estimates made. Only those procedures are considered from
which an objective estimate of the precision attained can be made from
the sample itself. And the procedures should be practical in the sense
that it is possible to carry them through according to desired
specifications. Out of all these procedures of sample selection and estimation
(called sample design), the one to be preferred is that which gives the
highest precision for a given cost of the survey or the minimum cost for a
specified level of precision. This is the guiding principle of sample design.


2.14 SCOPE OF THIS BOOK 

The scope of this book is limited to a description of sampling theory 
today. This theory does not point in a unique way to the best sample 
design for solving a particular problem. But it does provide a framework 
within which to think intelligently and produce effective methods. Not 
all the problems involved in the planning of sample surveys and their 
execution will be considered; it would be too ambitious to include them 
all here. Attention is restricted to different schemes of sample selection, 
formation of estimators of population characters, and an estimation of 
their variance from the sample itself. The situations under which one 
procedure is better than another are brought to the notice of the reader. 

The manner in which effective use can be made of all the relevant
information is indicated. It is explained how to make order out of the chaos that
results when the many types of errors that a sample investigation is subject
to are taken into account.


REFERENCE 

National Statistical Service of Greece (1963). “Household survey of urban 
areas.” Government of Greece, Athens. 





chapter three 


BASIC METHODS OF 
SAMPLE SELECTION 


3.1 SIMPLE RANDOM SAMPLING 

This fundamental method of sample selection may be described thus.
From a population of $N$ units select one by giving equal probability to
all units. This is best done with the help of random numbers (see
Appendix 3). Make a note of the unit selected and return it to the
population. If this operation is performed $n$ times, we get a simple random
sample of $n$ units, selected with replacement (wr). If, however, this
procedure is continued till $n$ distinct (different) units are selected and all
repetitions are ignored, a simple random sample of $n$ units, selected
without replacement (wtr), is obtained. The latter procedure is exactly the
same as retaining the unit (or units) selected and selecting a further unit
with equal probability from the units that remain in the population.

Denoting the units in the population by $U_1, U_2, \ldots, U_N$, it is obvious
that in wr sampling any selection produces $U_i$ ($i = 1, 2, \ldots, N$)
with probability $1/N$ for each unit. And all selections are independent
since the selected unit is restored to the population before making the
next selection. In wtr sampling, the first selection can produce any $U_i$
with a probability of $1/N$. Given that the first selection is $U_j$, the
conditional probability that the second selection will produce $U_i$ ($i \ne j$) is
$1/(N - 1)$. But the chance of getting $U_j$ at the first selection is $1/N$.
Hence the absolute probability that $U_i$ is selected at the second draw
(selection) is

$$\sum_{j \ne i} \frac{1}{N} \cdot \frac{1}{N - 1} = \frac{1}{N}$$



Similarly the absolute probability that a specified unit $U_i$ is selected at
the third draw is $1/N$, and so on. Thus in wtr sampling, too, each
selection can produce any unit $U_i$ with a probability of $1/N$. But the
conditional probability that the $s$th selection produces $U_i$, when it is known
that an earlier selection (say the $r$th) has produced $U_j$ ($i \ne j$), is $1/(N - 1)$.
Hence the chance that selections $r$ and $s$ produce two units $U_i$ and $U_j$
(order of selection ignored) is $2/[N(N - 1)]$. A sample of $n$ units gives
$\binom{n}{2}$ pairs of selections. Thus the probability $\pi_{ij}$ that units $U_i$ and $U_j$ will
appear in the sample of size $n$ is given by $\pi_{ij} = n(n - 1)/[N(N - 1)]$.
Also, the chance $\pi_i$ that unit $U_i$ will be selected in the sample would be
$n/N$, since the chance that it is selected at any particular draw is $1/N$.
The events "$U_i$ in sample" and "$U_j$ in sample" are not independent, since
$\pi_{ij} \ne \pi_i \pi_j$. Again, given that units $U_{i_1}, U_{i_2}, \ldots, U_{i_k}$ are selected in this
order at the first $k$ draws, the conditional probability that a specified unit
$U_i$ (not already selected) will be selected at the $(k + 1)$st draw is
$1/(N - k)$. Following this argument the probability of a specified ordered
sample of $n$ units is $1/[N(N - 1) \cdots (N - n + 1)]$. Hence the probability
of a specified sample of $n$ units, ignoring order, is

$$\frac{n!}{N(N - 1) \cdots (N - n + 1)} = \frac{1}{\binom{N}{n}}$$

Thus all samples of size $n$ have the same probability of being selected.
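
Since all $\binom{N}{n}$ samples are equally likely in wtr sampling, the inclusion probabilities derived above can be confirmed by listing every sample. The sketch below does this for small, arbitrarily chosen sizes (N = 6, n = 3 are assumptions for illustration only); the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def inclusion_probabilities(N, n):
    """Enumerate all equally likely wtr samples of n units out of N and
    return (pi_1, pi_12): the chances that unit 1, and units 1 and 2
    together, appear in the sample."""
    samples = list(combinations(range(1, N + 1), n))
    total = len(samples)
    pi_1 = Fraction(sum(1 in s for s in samples), total)
    pi_12 = Fraction(sum(1 in s and 2 in s for s in samples), total)
    return pi_1, pi_12

N, n = 6, 3  # illustrative sizes, not from the text
pi_1, pi_12 = inclusion_probabilities(N, n)
print(pi_1, pi_12)
```

In practice a sample would be drawn rather than enumerated; in Python's standard library, `random.sample` selects without replacement and `random.choices` selects with replacement.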


3.2 ESTIMATION IN SIMPLE RANDOM SAMPLING 

For simplicity of presentation we shall assume that to each unit $U_i$ in the
population is attached a variate value $Y_i$ for the character $y$. The
population total is $Y = \sum Y_i$, the mean being $\bar{Y} = \sum Y_i/N$. Let the $n$ units
(selected in this order) in the simple random sample be $u_1, u_2, \ldots, u_n$,
with variate values $y_1, y_2, \ldots, y_n$ respectively. We will prove the
following theorem.


Theorem 3.1 

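In wtr simple random sampling, the sample mean

$$\bar{y} = \frac{\sum y_i}{n} \tag{3.1}$$

is an unbiased estimator of $\bar{Y}$ and its variance is given by

$$V(\bar{y}) = (1 - f)\,\frac{S_y^2}{n} \tag{3.2}$$

where

$$S_y^2 = \frac{\sum (Y_i - \bar{Y})^2}{N - 1} \tag{3.3}$$

and $f$ is the sampling fraction $n/N$.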


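proof  The random variable $y_i$ attached to the $i$th selection can
have any of the values $Y_j$ ($j = 1, 2, \ldots, N$), each with a probability
of $1/N$ (Sec. 3.1). Hence $E(y_i) = \sum Y_j/N = \bar{Y}$. Using Theorem 1.1,
which states that the expected value of the sum is the sum of expected
values, we have

$$E(\bar{y}) = \frac{1}{n} \sum E(y_i) = \bar{Y}$$

Thus the sample mean is an unbiased estimator of the population mean.
The variance of $\bar{y}$ is given by

$$V(\bar{y}) = E(\bar{y} - \bar{Y})^2 = \frac{1}{n^2}\, E\Big(\sum z_i\Big)^2$$

where $z_i = y_i - \bar{Y}$. By Theorem 1.1, $E(\sum z_i)^2 = \sum E(z_i^2) + 2 \sum' E(z_i z_j)$,
where $\sum'$ denotes summation over the $\binom{n}{2}$ different pairs in the sample.
Now, writing $Z_j = Y_j - \bar{Y}$ for the population values,

$$E(z_i^2) = \frac{\sum Z_j^2}{N} = \frac{(N - 1) S_y^2}{N}$$

Given $z_i$, the conditional expected value of $z_j$ would be

$$E(z_j \mid z_i) = \frac{\sum Z_j - z_i}{N - 1} = -\frac{z_i}{N - 1}$$

since

$$\sum Z_j = \sum (Y_j - \bar{Y}) = 0$$

Hence, by Theorem 1.7,

$$E(z_i z_j) = -\frac{E(z_i^2)}{N - 1} = -\frac{\sum Z_j^2}{N(N - 1)} = -\frac{S_y^2}{N}$$

Thus

$$V(\bar{y}) = \frac{N - 1}{N}\,\frac{S_y^2}{n} - \frac{n - 1}{N}\,\frac{S_y^2}{n} = \frac{S_y^2}{n}\Big(1 - \frac{n}{N}\Big) = (1 - f)\,\frac{S_y^2}{n}$$

where $f = n/N$ is the sampling fraction, the fraction of the population
taken into the sample.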

Corollary 

An unbiased estimate of the population total $Y = N\bar{Y}$ is given by $\hat{Y} = N\bar{y}$
and $V(\hat{Y}) = N^2 V(\bar{y})$.
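
Theorem 3.1 can be verified on a small population by averaging over every possible wtr sample. The sketch below is illustrative only: the variate values are made up, and the function name is an assumption. It returns the exact expectation and variance of the sample mean together with the value given by Eq. (3.2).

```python
from fractions import Fraction
from itertools import combinations

def check_theorem_3_1(Y, n):
    """Return (E(ybar), V(ybar), (1 - f) * S_y^2 / n) for wtr samples of size n,
    with expectations taken over all equally likely samples."""
    N = len(Y)
    Ybar = Fraction(sum(Y), N)
    S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)
    means = [Fraction(sum(s), n) for s in combinations(Y, n)]
    e_mean = sum(means) / len(means)
    v_mean = sum((m - Ybar) ** 2 for m in means) / len(means)
    return e_mean, v_mean, (1 - Fraction(n, N)) * S2 / n

Y = [2, 5, 7, 11, 15]  # hypothetical variate values
e_mean, v_mean, formula = check_theorem_3_1(Y, 3)
print(e_mean, v_mean, formula)
```

The first value equals the population mean and the last two agree, as the theorem asserts.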


Remark  The variance of $y$ in the population is, by definition,

$$\sigma_y^2 = \frac{\sum (Y_i - \bar{Y})^2}{N} \tag{3.4}$$

Thus the variance of the sample mean may be written as

$$V(\bar{y}) = \left[1 - (n - 1)(N - 1)^{-1}\right] \frac{\sigma_y^2}{n}$$

If the sampling fraction is very small so that $n/N$ and $(n - 1)/(N - 1)$
are negligible relative to unity, we have $V(\bar{y}) \approx \sigma_y^2/n \approx S_y^2/n$. Then
the variance of $\bar{y}$ depends solely on the sample size and the population
variance and not on the population size $N$.

Remark  The quantity $1 - (n - 1)/(N - 1)$ is called the finite population
correction (fpc) and $N/n$ the raising, inflation, or expansion factor.

Remark  One may think of choosing a sample size such that the coefficient
of variation $\sigma(\bar{y})/E(\bar{y})$ of $\bar{y}$ has a specified value $a_0$. (The coefficient
of variation measures the precision of the estimator.) In that case, and
provided that a good guess can be made of $\bar{Y}$ and $S_y^2$, the sample size $n$
is calculated from the formula

$$\left(\frac{1}{n} - \frac{1}{N}\right) S_y^2 = a_0^2\, \bar{Y}^2$$







Further reading  It is of interest to note that it always pays to possess
information on some units in the population. If the variate value of a 
unit is known, a simple random sample of n units from the remaining 
(N — 1) units gives a better estimate of the population total than a 
sample of n taken from the N (see Exercise 2). 


3.3 ESTIMATION OF SAMPLING ERROR IN WTR SAMPLING 

In order to obtain an estimate of V(y), we prove the following theorem. 


Theorem 3.2 

In wtr simple random sampling

$$E(s_y^2) = S_y^2 \tag{3.5}$$

where

$$s_y^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} \tag{3.6}$$

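proof  We have $(n - 1) s_y^2 = \sum y_i^2 - n\bar{y}^2$. Since

$$V(\bar{y}) = E(\bar{y}^2) - [E(\bar{y})]^2$$

we get

$$E(\bar{y}^2) = V(\bar{y}) + \bar{Y}^2 = \bar{Y}^2 + (1 - f)\,\frac{S_y^2}{n}$$

Also,

$$E\Big(\sum y_i^2\Big) = \frac{n}{N} \sum Y_i^2$$

Hence,

$$\begin{aligned}
(n - 1) E(s_y^2) &= \frac{n}{N} \sum Y_i^2 - (1 - f) S_y^2 - n\bar{Y}^2 \\
&= \frac{n}{N} \Big(\sum Y_i^2 - N\bar{Y}^2\Big) - (1 - f) S_y^2 \\
&= \left[\frac{n(N - 1)}{N} - (1 - f)\right] S_y^2 = (n - 1) S_y^2
\end{aligned}$$

which proves the theorem.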


Corollary

Unbiased estimates of $V(\bar{y})$ and $V(\hat{Y})$ are given by $(1 - f)\, s_y^2/n$ and
$N^2 (1 - f)\, s_y^2/n$, respectively.
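
In survey practice the corollary is applied to a single sample. The following sketch shows the arithmetic; the function name `srs_estimates`, the returned dictionary keys, and the artificial population are all assumptions made for illustration.

```python
import math
import random

def srs_estimates(sample, N):
    """Mean, estimated total, and their estimated standard errors
    for a wtr simple random sample drawn from N units."""
    n = len(sample)
    ybar = sum(sample) / n
    s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)   # Eq. (3.6)
    v_mean = (1 - n / N) * s2 / n                          # corollary to Theorem 3.2
    return {
        "mean": ybar,
        "se_mean": math.sqrt(v_mean),
        "total": N * ybar,
        "se_total": N * math.sqrt(v_mean),
    }

random.seed(1)                                             # reproducible illustration
population = [random.randint(0, 50) for _ in range(200)]   # made-up data
sample = random.sample(population, 20)                     # wtr selection
print(srs_estimates(sample, len(population)))
```

Only the sample values and the population size are needed, which is the point of the remark that follows: the sample itself supplies an objective estimate of its own variance.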



Remark  The result (3.5) is the reason for the definition of the
quantity $S_y^2$ with the divisor $(N - 1)$.

Remark  We are dealing with a case of probability sampling since
any unit in the population has a known (nonzero) probability of $n/N$ of
being selected in the sample. It was stated in Sec. 2.7 that in probability
sampling the sample itself provides an objective estimate of the variance.
Theorem 3.2 is a demonstration of that statement.


Further reading  See Exercise 16 for an alternative method of estimating
the variance. See Exercises 82 and 83 as well.

3.4 SAMPLING WITH REPLACEMENT 

If the sample of size n is selected with replacement, we shall prove the 
following result. 

Theorem 3.3 

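In wr simple random sampling

$$E(\bar{y}) = \bar{Y} \qquad V(\bar{y}) = \frac{\sigma_y^2}{n} = \left(1 - \frac{1}{N}\right) \frac{S_y^2}{n} \tag{3.7}$$

and

$$V(\bar{y}) \mathrel{\widehat{=}} \frac{s_y^2}{n}$$

where $\widehat{=}$ stands for "is estimated by."

The proof of this theorem can best be given by first proving the more
general Theorem 3.4.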


Theorem 3.4 

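Let $u_i$ ($i = 1, 2, \ldots, m$) be $m$ independent and identically distributed
random variables with $E(u_i) = \mu$, $V(u_i) = \sigma^2$. Let $\bar{u} = \sum u_i/m$. Then
$E(\bar{u}) = \mu$, $V(\bar{u}) = \sigma^2/m$, and an unbiased estimator of $V(\bar{u})$ is given by

$$V(\bar{u}) \mathrel{\widehat{=}} \frac{\sum (u_i - \bar{u})^2}{m(m - 1)}$$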

proof  By Theorem 1.1, $E(\bar{u}) = \sum E(u_i)/m = \mu$. By Theorem 1.5,
$V(\bar{u}) = \sum V(u_i)/m^2 = \sigma^2/m$, the terms such as $\mathrm{Cov}(u_i, u_j)$ vanishing
because of the independence of $u_i$ and $u_j$. Further,

$$E \sum (u_i - \bar{u})^2 = \sum E(u_i^2) - m E(\bar{u}^2)$$

Now,

$$E(u_i^2) = V(u_i) + [E(u_i)]^2 = \sigma^2 + \mu^2$$

And,

$$E(\bar{u}^2) = V(\bar{u}) + [E(\bar{u})]^2 = \frac{\sigma^2}{m} + \mu^2$$

Hence

$$E \sum (u_i - \bar{u})^2 = m(\sigma^2 + \mu^2) - \sigma^2 - m\mu^2 = (m - 1)\sigma^2$$

which proves the theorem. (This is a very powerful result, which can be
used whenever the random variables are independent.)

In order to prove Theorem 3.3 we note that $y_1, y_2, \ldots, y_n$ are
independently and identically distributed. The variable $y_i$ associated
with the $i$th selection can have any of the values $Y_1, Y_2, \ldots, Y_N$, each
with probability $1/N$. Hence $E(y_i) = \bar{Y}$, $V(y_i) = \sigma_y^2$. Setting $u_i = y_i$,
$m = n$ in Theorem 3.4, we immediately prove Theorem 3.3.


Remark  We note from Eqs. (3.2) and (3.7) that the variance of the
mean in wtr sampling is smaller than the variance in wr sampling. Hence,
if the estimator to be used is the mean of the values of all units in the
sample, wtr sampling is preferable to wr sampling. But there is no
appreciable difference if both $1/N$ and $n/N$ are negligibly small as
compared with unity.


3.5 A GENERAL PROCEDURE 

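A very general procedure of writing down the estimator and its variance
will now be presented. This method will be used on several occasions in
this book. Let $t_i$ be the number of times the $i$th unit $U_i$ (with variate
value $Y_i$) in the population appears in the sample of size $n$, howsoever
selected. In wr simple random sampling, we have (Sec. 1.8)

$$E(t_i) = \frac{n}{N} \qquad V(t_i) = \frac{n}{N}\left(1 - \frac{1}{N}\right) \qquad \mathrm{Cov}(t_i, t_j) = -\frac{n}{N^2}$$

In wtr random sampling, $t_i$ is either zero or one. And

$$\Pr(t_i = 1) = \frac{n}{N} \qquad \Pr(t_i = 1, t_j = 1) = \frac{n(n - 1)}{N(N - 1)}$$

Hence we get

$$E(t_i) = \frac{n}{N}$$

$$V(t_i) = E(t_i^2) - [E(t_i)]^2 = \frac{n}{N}\left(1 - \frac{n}{N}\right)$$

$$\mathrm{Cov}(t_i, t_j) = E(t_i t_j) - E(t_i) E(t_j) = \frac{n(n - 1)}{N(N - 1)} - \frac{n^2}{N^2}$$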





In either case, the estimator to use is

$$\bar{y} = \frac{\sum_{i=1}^{N} t_i Y_i}{n} \tag{3.8}$$

where the summation is over all units in the population. The expected
value of $\bar{y}$ is, by Theorem 1.1 (Sec. 1.3),

$$E(\bar{y}) = \frac{\sum E(t_i) Y_i}{n} \tag{3.9}$$

and its variance is given by (using Theorem 1.5)

$$V(\bar{y}) = \frac{1}{n^2} \Big[\sum_i Y_i^2 V(t_i) + 2 \sum_i \sum_{j > i} Y_i Y_j\, \mathrm{Cov}(t_i, t_j)\Big] \tag{3.10}$$

Whether the sampling scheme is wtr or wr simple random sampling we
see from (3.9) that $\bar{y}$ is an unbiased estimator of $\bar{Y}$. Making relevant
substitutions in (3.10) we can verify that the variance of $\bar{y}$ is given by (3.2)
for wtr sampling and by (3.7) for wr sampling.
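
The substitution into (3.10) can be carried out mechanically. The sketch below evaluates (3.10) from the moments of the counts $t_i$ for each scheme and compares the result with (3.2) and (3.7). The population values and the function name are assumptions chosen only for illustration.

```python
from fractions import Fraction

def variance_from_counts(Y, n, scheme):
    """Evaluate Eq. (3.10) using the moments of the counts t_i for
    scheme 'wr' or 'wtr' simple random sampling."""
    N = len(Y)
    if scheme == "wr":
        v_t = Fraction(n, N) * (1 - Fraction(1, N))
        cov_t = -Fraction(n, N * N)
    else:
        v_t = Fraction(n, N) * (1 - Fraction(n, N))
        cov_t = Fraction(n * (n - 1), N * (N - 1)) - Fraction(n, N) ** 2
    squares = sum(y * y for y in Y) * v_t
    cross = 2 * sum(Y[i] * Y[j] for i in range(N) for j in range(i + 1, N)) * cov_t
    return (squares + cross) / n ** 2

Y = [2, 5, 7, 11, 15]  # hypothetical variate values
N, n = len(Y), 3
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)
print(variance_from_counts(Y, n, "wtr") == (1 - Fraction(n, N)) * S2 / n)
print(variance_from_counts(Y, n, "wr") == (1 - Fraction(1, N)) * S2 / n)
```

The same function would serve for any scheme in which the variances and covariances of the counts are known, which is what makes the procedure general.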


3.6 A BETTER ESTIMATOR IN WR SAMPLING

In estimating the population mean from a wr simple random sample,
we have used the mean of all the selections, including repetitions.
It will now be proved that an estimator based on the distinct units only
(Raj and Khamis, 1958) is superior. This is done in Theorem 3.5.

Theorem 3.5 

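Let $u$ be the number of distinct units in a wr simple random sample of
size $n$. Let $k_r$ be the frequency with which the $r$th distinct unit occurs
in the sample. Then

$$E(\bar{y}_u) = \bar{Y} \qquad V(\bar{y}_u) \le V(\bar{y}_n) \tag{3.11}$$

where

$$\bar{y}_u = \frac{1}{u} \sum_{r=1}^{u} y_r \qquad \bar{y}_n = \frac{1}{n} \sum_{r=1}^{u} k_r y_r$$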



proof  Given the number $u$ of distinct units, the sample of distinct
units is a simple random sample selected without replacement. Hence
$E_2(\bar{y}_u \mid u) = \bar{Y}$ and therefore $E(\bar{y}_u) = \bar{Y}$ (Theorem 1.7).
Again, for a given sample $A_u = (y_1, y_2, \ldots, y_u)$ of $u$ distinct units, the
probability that a specified distinct unit $y_r$ will be selected at any selection
(there being $n$ such selections) is $1/u$ and therefore $E_2(k_r \mid A_u) = n/u$.
Hence,

$$E_2(\bar{y}_n \mid A_u) = \frac{1}{n} \sum y_r\, E_2(k_r \mid A_u) = \frac{1}{u} \sum y_r = \bar{y}_u$$

Now, by Theorem 1.8 on conditional variance, we have

$$V(\bar{y}_n) = E_1 V_2(\bar{y}_n) + V_1 E_2(\bar{y}_n) = E_1 V_2(\bar{y}_n) + V(\bar{y}_u) \ge V(\bar{y}_u)$$

which proves the theorem.


Corollary 

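By the same Theorem 1.8,

$$V(\bar{y}_u) = E_1 V_2(\bar{y}_u) = \left[E\Big(\frac{1}{u}\Big) - \frac{1}{N}\right] S_y^2$$

since $E_2(\bar{y}_u \mid u) = \bar{Y}$ does not vary with $u$ and, by Theorem 3.1,
$V_2(\bar{y}_u \mid u) = (1/u - 1/N) S_y^2$.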

Remark  In order to get an unbiased estimator of $V(\bar{y}_u)$, we note that
for $u \ge 2$, an unbiased estimator of $S_y^2$ is provided by

$$s_u^2 = \frac{\sum (y_i - \bar{y}_u)^2}{u - 1}$$

Thus, considering

$$G_u = \left[\Big(\frac{1}{u} - \frac{1}{N}\Big) + N^{1-n} \Big(1 - \frac{1}{u}\Big)\right] s_u^2$$

we have

$$E[G_u \mid u \ge 2] = V(\bar{y}_u)$$

An alternative unbiased estimator is

$$G_u' = \left[\Big(\frac{1}{u} - \frac{1}{N}\Big) + (N - 1)(N^n - N)^{-1}\right] s_u^2$$

where $s_u^2 = \sum (y_i - \bar{y}_u)^2/(u - 1)$ for $u \ge 2$ and 0 for $u = 1$.

Further reading 

1. For a realistic comparison between wtr and wr sampling schemes,
see Exercise 17, in which the effective sample size (or total cost) is the 
same in both cases. 

2. If sampling with replacement is continued till the sample contains n 
distinct units, two estimators may be formed, one based on the distinct 
units only and the other based on all selections. For a comparison of the 
two estimators, see Exercises 1 and 95. 
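
Theorem 3.5 and the unbiasedness of $G_u'$ can be checked on a small population by listing all $N^n$ equally likely ordered wr samples. The sketch below is illustrative only; the population values, sample size, and function name are assumptions, and exact fractions are used so the comparisons are free of rounding.

```python
from fractions import Fraction
from itertools import product

def compare_wr_estimators(Y, n):
    """Enumerate all N**n equally likely ordered wr samples and return
    (V(ybar_u), V(ybar_n), E(G'_u)) as exact fractions."""
    N = len(Y)
    Ybar = Fraction(sum(Y), N)
    v_u = v_n = e_g = Fraction(0)
    for draw in product(range(N), repeat=n):
        distinct = sorted(set(draw))
        u = len(distinct)
        ybar_u = Fraction(sum(Y[i] for i in distinct), u)
        ybar_n = Fraction(sum(Y[i] for i in draw), n)
        v_u += (ybar_u - Ybar) ** 2
        v_n += (ybar_n - Ybar) ** 2
        if u >= 2:  # s_u^2 is taken as 0 when u = 1
            s2 = sum((Y[i] - ybar_u) ** 2 for i in distinct) / (u - 1)
            e_g += (Fraction(1, u) - Fraction(1, N)
                    + Fraction(N - 1, N ** n - N)) * s2
    total = N ** n
    return v_u / total, v_n / total, e_g / total

Y = [2, 5, 7, 11]  # hypothetical variate values
v_u, v_n, e_g = compare_wr_estimators(Y, 3)
print(v_u, v_n, e_g)
```

The first value does not exceed the second, and the third equals the first, in line with (3.11) and the remark on $G_u'$. The enumeration grows as $N^n$, so this is a check for toy cases only.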





3.7 ESTIMATION OF PROPORTIONS 

We now turn to the problem of estimating from a simple random sample 
the proportion P of units of a population which belong to a class A (like 
the proportion male, proportion unemployed, etc.). If we associate with 
$U_i$, the $i$th unit in the population, a variable $Y_i$ which takes up the value
unity if the unit belongs to A and zero otherwise, it is easy to see that the
total number of units belonging to A equals $\sum Y_i = Y$, and the proportion
belonging to A is $P = \sum Y_i/N = \bar{Y}$. Thus the problem of estimating a
population proportion reduces to that of estimating a population mean
by defining the variable $y$ as above. Hence no new principles are involved
provided that we work with the new variate $y$ for purposes of making
estimates from the sample. Denoting by $N_0$ and $n_0$ the number of units
belonging to the class A in the population and in the sample of size $n$
respectively, we have

$$\sum Y_i = N_0 = \sum Y_i^2 = NP \qquad \sum y_i = n_0 = \sum y_i^2 = np$$

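where $p$ is the sample proportion. In simple random sampling, by
Theorems 3.1 and 3.3, an unbiased estimate of the population proportion
$P$ is given by

$$\hat{P} = \frac{1}{n} \sum y_i = \frac{n_0}{n} = p \tag{3.12}$$

In wtr sampling, the variance of $p$, by Theorem 3.1, is

$$V(p) = (1 - f)\,\frac{S_y^2}{n} = (1 - f)\,\frac{N}{N - 1}\,\frac{P(1 - P)}{n} \tag{3.13}$$

since

$$S_y^2 = \frac{\sum Y_i^2 - (\sum Y_i)^2/N}{N - 1} = \frac{NP - (NP)^2/N}{N - 1} = \frac{NP(1 - P)}{N - 1}$$

And further

$$V(p) \mathrel{\widehat{=}} (1 - f)\,\frac{s_y^2}{n} = \frac{1 - f}{n} \cdot \frac{np - (np)^2/n}{n - 1} = (1 - f)\,\frac{p(1 - p)}{n - 1} \tag{3.14}$$

In wr sampling, it is easy to see that

$$V(p) = \frac{P(1 - P)}{n} \mathrel{\widehat{=}} \frac{p(1 - p)}{n - 1}$$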

Remark  The estimate of $N_0 = NP$, the number of units belonging to
the class A, is obtained by multiplying the sample proportion $p$ by $N$.


Remark  Provided the sampling fraction is negligible as compared with
unity and $(N - 1)/N \approx 1$, we have that $V(p) = P(1 - P)/n$ for wtr
sampling. Thus the variance of $p$ depends on the population proportion
$P$ and the sample size $n$. And $P(1 - P)$ is maximum for $P = 1/2$, for
which value of $P$ the variance of $p$ is $1/(4n)$. On the other hand, the
coefficient of variation of $p$ is $[P(1 - P)/n]^{1/2}/P = [(1 - P)/(nP)]^{1/2}$. This
decreases monotonically as $P$ increases from 0 to 1.

Remark  Suppose it is desired to estimate a population proportion with a
coefficient of variation of $a_0$ or less. Then the sample size required to
achieve this is given by the formula

$$(1 - P)^{1/2} \le (nP)^{1/2} a_0 \qquad \text{or} \qquad n \ge \frac{1 - P}{P a_0^2}$$

If the value of $P$ is guessed as 0.05 and $a_0$ is 0.10, $n \ge 1900$. A very large
sample size is needed if $P$ is very small, viz., the item is rare in the
population.
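
The sample-size rule in the last remark is easily turned into a small calculation. The sketch below is illustrative; the function name is an assumption, and the inputs are passed as decimal strings so that exact fractions, rather than binary floating point, decide the rounding up.

```python
import math
from fractions import Fraction

def sample_size_for_proportion(P, a0):
    """Smallest n with coefficient of variation of p at most a0,
    using n >= (1 - P) / (P * a0**2) (sampling fraction ignored)."""
    P, a0 = Fraction(P), Fraction(a0)
    return math.ceil((1 - P) / (P * a0 ** 2))

print(sample_size_for_proportion("0.05", "0.10"))
```

With the guessed value P = 0.05 and $a_0$ = 0.10 this reproduces the figure of 1900 quoted in the remark.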


3.8 SYSTEMATIC SAMPLING 


A more convenient method of sample selection when the units are serially
numbered from 1 to $N$ is the following. Suppose $N = nk$, where $n$ is the
sample size desired and $k$ is an integer. A number is taken at random
from the numbers 1 to $k$ (using a table of random numbers). Suppose
the random number is $i$. Then the sample contains the $n$ units with serial
numbers $i, i + k, i + 2k, \ldots, i + (n - 1)k$. Thus the sample
consists of the first unit selected at random and every $k$th unit thereafter.
It is therefore called a systematic sample (with $k$ as the sampling interval),
and the procedure of selection is known as systematic sampling. The
convenience of selection lies in the fact that the selection of the first
member of the sample determines the entire sample automatically. The
first point to be noted about this procedure is that, for a given numbering
of the units, we are in effect selecting with probability $1/k$ one group or
cluster of units from the following $k$ clusters forming the entire population:

Cluster    Composition of cluster
1          1, 1 + k, 1 + 2k, ..., 1 + (n - 1)k
2          2, 2 + k, 2 + 2k, ..., 2 + (n - 1)k
...        ...
k          k, 2k, 3k, ..., nk


Every unit in the population belongs to one and only one cluster. The
probability of selecting a cluster is $1/k$, which is therefore the probability
with which any member of the cluster is selected in the sample. Thus
$\pi_i = 1/k$. This shows that systematic sampling is a probability sampling
procedure. The chance that two units $U_i$ and $U_j$ belonging to the same
cluster are in the sample is obviously given by $\pi_{ij} = 1/k$. However,
$\pi_{ij} = 0$ when the two units referred to belong to different clusters. In
case $N = nk + c$, $c < k$, some of the clusters will contain $n$ units while
others will contain $n + 1$ units; i.e., cluster sizes will not be equal. But
the probability that a given unit is selected into the sample (of size $n$ or
$n + 1$) will still be $1/k$, since one cluster is picked up at random from the $k$.

3.9 ESTIMATION IN SYSTEMATIC SAMPLING 


Theorem 3.6

In systematic sampling with interval $k$, an unbiased estimator of the
population total $Y$ is provided by

$$\hat{Y} = k \sum_j y_{ij} = kG_i \tag{3.15}$$

where $G_i = \sum_j y_{ij}$ is the total of the sample cluster. And

$$V(\hat{Y}) = k \sum_{i=1}^{k} G_i^2 - Y^2 \tag{3.16}$$

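proof  The random variable $kG_i$, attached to the sample cluster,
takes up the values $kG_1, kG_2, \ldots, kG_k$, each with probability $1/k$.
Hence

$$E(\hat{Y}) = \frac{1}{k} \sum_{i=1}^{k} kG_i = \sum_{i=1}^{k} G_i = Y$$

Further, by the same argument,

$$V(\hat{Y}) = E(\hat{Y} - Y)^2 = \frac{1}{k} \sum_{i=1}^{k} (kG_i - Y)^2 = k \sum_{i=1}^{k} G_i^2 - Y^2$$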


Corollary 

The population mean will be estimated by $\hat{Y}/N$.

Remark  In the derivation of Theorem 3.6 no assumption has been made
that all clusters must contain the same number of units.

Remark If the numbering of units in the population is changed, different
clusters will be formed as a result of the selection procedure. The cluster
totals $G_i$, on which $V(\hat{Y})$ depends, will be different. Thus the variance
associated with systematic sampling depends heavily on the manner in
which the units are arranged in the population at the time of sample
selection. This is in marked contrast with simple random sampling, in
which the arrangement of units has no part to play.

Remark One cannot infer from Formula (3.16) that the variance in
systematic sampling will surely decrease if the size of the sample is
increased. There is no guarantee that a larger sample will necessarily
produce $G_i$ that are less variable. This makes it clear that systematic
sampling is a delicate tool if it is used solely for the purpose of achieving
higher precision.
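The dependence of the variance on the arrangement of the units can be seen in a small numerical sketch. The population values, the two orderings, and the function names below are hypothetical illustrations; the code only enumerates the k possible systematic samples and compares Formula (3.16) with the variance computed directly.

```python
from fractions import Fraction

def cluster_totals(y, k):
    """Totals G_1..G_k of the k possible systematic samples (units r, r+k, r+2k, ...)."""
    return [sum(y[r::k]) for r in range(k)]

def estimates(y, k):
    """The k equally likely values k*G_i of the estimator (3.15)."""
    return [k * g for g in cluster_totals(y, k)]

def variance_3_16(y, k):
    """Formula (3.16): k * sum(G_i^2) - Y^2."""
    return k * sum(g * g for g in cluster_totals(y, k)) - sum(y) ** 2

def variance_direct(y, k):
    """Variance taken directly over the k equally likely estimates."""
    total = sum(y)
    return Fraction(sum((e - total) ** 2 for e in estimates(y, k)), k)

# Hypothetical population of 12 units, listed in two different orders.
ordered = list(range(1, 13))
rearranged = [1, 12, 2, 11, 3, 10, 4, 9, 5, 8, 6, 7]

print(variance_3_16(ordered, 3), variance_3_16(rearranged, 3))
```

The same twelve values give different variances under the two numberings, and the check also goes through when N is not a multiple of k, in line with the remark that equal cluster sizes are not required.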


3.10 AN ALTERNATIVE EXPRESSION FOR THE VARIANCE 

An alternative expression for the variance of the systematic sample esti¬ 
mate, which is more instructive, is given below. 


Theorem 3.7 

In systematic sampling, with a sampling interval of k, from a population of
size N = nk, the variance of

$$\hat{Y} = N\bar{y}_i$$

is given by

$$V(\hat{Y}) = \frac{N(N-1)}{n} S_y^2 \, [1 + (n-1)\rho] \tag{3.17}$$

where $\bar{y}_i$ is the sample mean and

$$\rho = \frac{E(y_{ij} - \bar{Y})(y_{ij'} - \bar{Y})}{E(y_{ij} - \bar{Y})^2} \tag{3.18}$$

is the intracluster correlation coefficient defined in Sec. 1.9.


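PROOF From Theorem 3.6 we have

$$\begin{aligned}
V(\hat{Y}) &= \frac{1}{k} \sum_i (kG_i - Y)^2 = k \sum_i \Big[ \sum_j (y_{ij} - \bar{Y}) \Big]^2 \\
&= k \sum_i \sum_j (y_{ij} - \bar{Y})^2 + 2k \sum_i \sum_j \sum_{j' > j} (y_{ij} - \bar{Y})(y_{ij'} - \bar{Y}) \\
&= k(N-1)S_y^2 + k(n-1)(N-1)\rho S_y^2 \\
&= N(N-1)S_y^2 \, \frac{1 + \rho(n-1)}{n}
\end{aligned}$$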

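Corollary

The condition that systematic sampling be more precise than wtr simple
random sampling is that

$$\frac{N(N-1)}{n} S_y^2 \, [1 + (n-1)\rho] < \frac{N(N-n)}{n} S_y^2$$

or

$$\rho < -\frac{1}{N-1} \tag{3.19}$$

If N is large, this requires that $\rho$ be negative if systematic sampling is to be
superior to simple random sampling.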

Remark The quantity $\rho$ is the correlation coefficient between pairs of
units in the same systematic sample. Since there is no algebraic relationship
between n and $\rho$, the performance of systematic sampling, as n is
increased, becomes unpredictable.

Remark If the ordering of the units in the population is essentially
random, a sample of any n predesignated positions will be a simple random
sample. Hence, systematic sampling in this situation coincides with wtr
simple random sampling. The same conclusion follows from the observation
that $\rho = -1/(N - 1)$ when the units are arranged at random.
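As an illustration of Theorem 3.7 and its corollary, the sketch below computes the intracluster correlation for a hypothetical population of 12 units listed in increasing order and evaluates (3.17). The function names are illustrative only, and the code assumes N is an exact multiple of k.

```python
from fractions import Fraction
from itertools import combinations

def intracluster_rho(y, k):
    """rho of (3.18): average cross product over within-cluster pairs / population variance."""
    N = len(y)
    n = N // k                      # assumes N = n * k exactly
    ybar = Fraction(sum(y), N)
    clusters = [y[r::k] for r in range(k)]
    cross = sum((a - ybar) * (b - ybar)
                for c in clusters for a, b in combinations(c, 2))
    n_pairs = k * n * (n - 1) // 2
    variance = sum((v - ybar) ** 2 for v in y) / N
    return (cross / n_pairs) / variance

def variance_3_17(y, k):
    """Formula (3.17): N(N-1) S_y^2 [1 + (n-1) rho] / n."""
    N = len(y)
    n = N // k
    ybar = Fraction(sum(y), N)
    s2 = sum((v - ybar) ** 2 for v in y) / (N - 1)
    return Fraction(N * (N - 1), n) * s2 * (1 + (n - 1) * intracluster_rho(y, k))

y = list(range(1, 13))              # hypothetical population
rho = intracluster_rho(y, 3)
print(rho, variance_3_17(y, 3), rho < Fraction(-1, len(y) - 1))
```

For this ordering $\rho$ falls below $-1/(N-1)$, so condition (3.19) holds and the systematic sample is the more precise of the two.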


3.11 ESTIMATION OF VARIANCE IN SYSTEMATIC SAMPLING 

Since the variance of $\hat{Y}$ in systematic sampling is $k^2$ times the variance
of $G_i$ (Sec. 3.9), an unbiased estimate of $V(\hat{Y})$ cannot be obtained by
making just one observation on $G_i$, that is, by taking just one systematic
sample. If a rigorous estimate of the sampling variance is wanted, it can
best be obtained by taking not one but m independent systematic samples,
each containing n/m units to keep the total sample size the same.
Theorem 3.4 will then provide an unbiased estimate of $V(\hat{Y})$.

If the arrangement of units in the population can be considered to be
random, the formula for estimating the variance in the case of wtr
simple random sampling will apply. But quite often one would like to
arrange the units in a particular pattern. For example, the city blocks
may be numbered in a serpentine (geographically contiguous) fashion and a
systematic sample of blocks taken to get a fair spread of the sample over
the whole city. This will have the effect of introducing negative
correlation within systematic samples. Since the units are deliberately
ordered, the formulas for estimating the variance appropriate to simple
random sampling will not apply. However, the position is not as bad as it
looks. To anticipate matters, systematic sampling is generally used in
large-scale surveys at the last stage of the sampling process (to select
households, etc.) and the results of Chap. 6 will show that in this case
rigorous estimates of the variance can be obtained.


3.12 SAMPLING WITH UNEQUAL PROBABILITIES 

In the two schemes of sample selection discussed so far, every unit in the
population had the same chance of being selected in the sample. The
reader might get the impression that this is essential to the argument.
It is not so. The only requirement is that the probabilities of inclusion
should be known and should be nonzero. It can, in fact, be demonstrated
that higher precision may be achieved by making the probabilities unequal.
Now the question is: What is the procedure of selecting the sample with
unequal probabilities? We shall take the concrete case of a population
of N agricultural holdings with areas $X_1, X_2, \ldots, X_N$, the total area
being $X = \sum X_i$. A sample of one holding is to be selected such that the
chance that the ith holding will be selected is $X_i/X$. If the $X_i$ are integers,
all we have to do is to assign the first $X_1$ natural numbers (1 to $X_1$) to
the first holding, the next $X_2$ numbers ($X_1 + 1$ to $X_1 + X_2$) to the second
holding, and so on. A number is then selected at random between 1 and
X (with the help of a table of random numbers) and the unit in whose
range this random number falls is taken in the sample. It is evident that
the probability that $U_i$ is selected is $X_i/X$. The $X_i$'s are called measures
of size of the units, and the procedure is known as sampling with
probability proportionate to size (pps). It may be noted that the measures of
size can all be multiplied by a constant number to make them integral if
they are not already so. This will not disturb the probabilities. If the
above procedure of selection is repeated n times with the precaution that
the unit selected is restored to the population at every selection, we get a
pps sample of size n selected with replacement.
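The cumulative-total procedure just described can be sketched as follows. The measures of size are hypothetical integers, units are indexed from 0, and Python's pseudorandom generator stands in for the table of random numbers.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def unit_for_number(sizes, r):
    """Index of the unit whose range of cumulated sizes contains the number r (1 <= r <= X)."""
    cumulated = list(accumulate(sizes))
    return bisect_left(cumulated, r)    # first cumulative total that is >= r

def pps_sample_wr(sizes, n, rng):
    """n independent pps selections, the selected unit being restored each time."""
    total = sum(sizes)
    return [unit_for_number(sizes, rng.randint(1, total)) for _ in range(n)]

sizes = [3, 2, 5]                       # hypothetical measures of size, X = 10
print(pps_sample_wr(sizes, 4, random.Random(1)))
```

Counting how many of the numbers 1 to X fall in each unit's range recovers the measures of size, which is why the selection probabilities are $X_i/X$.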


3.13 AN ALTERNATIVE SAMPLING PROCEDURE


In the method just described the measures of size $X_i$ of the units have to
be cumulated progressively in order to assign them ranges of the type
$X_1 + 1$ to $X_1 + X_2$. If N, the total number of units, is fairly large, the
process of cumulation will become tedious. A procedure of selecting a
pps sample has been devised by Lahiri (1951), in which no cumulations
have to be made. This consists in selecting a number at random between
1 and N and noting down the unit selected provisionally. Another random
number is then taken between 1 and $X_0$, where $X_0$ is the maximum (or
something greater) of the N measures of size. If the second random
number does not exceed the size of the unit provisionally selected, this
unit is finally taken into the sample. If not, the entire procedure (of
selecting two random numbers) is repeated until a unit is finally selected.
In order to prove that this procedure gives a pps sample of size one, we
note that the chance that a trial (consisting in taking two random numbers)
will end in no selection is given by

$$q = \sum_i \frac{1}{N}\Big(1 - \frac{X_i}{X_0}\Big) = 1 - \frac{\bar{X}}{X_0}, \qquad \bar{X} = X/N$$

(The probability of selecting $U_i$ provisionally is $1/N$, and the probability
that the second random number exceeds $X_i$ is $1 - X_i/X_0$.) The chance
that the unit $U_i$ is selected at a trial is $p_i = (1/N)X_i/X_0$. Hence the
chance that the sample of one will finally end up in the selection of the
unit $U_i$ is $p_i + qp_i + q^2p_i + \cdots = p_i/(1 - q) = X_i/X$.
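A minimal sketch of Lahiri's procedure, under the assumption of integer measures of size and 0-based unit indices, is given below. The second function merely repeats the argument of the text in exact arithmetic; both function names are illustrative.

```python
import random
from fractions import Fraction

def lahiri_draw(sizes, rng, x0=None):
    """One pps selection without cumulating the sizes (acceptance-rejection)."""
    n_units = len(sizes)
    x0 = max(sizes) if x0 is None else x0   # X_0: the maximum size, or something greater
    while True:
        i = rng.randint(1, n_units) - 1     # provisional selection of a unit
        r = rng.randint(1, x0)              # second random number
        if r <= sizes[i]:                   # accept if it does not exceed X_i
            return i

def lahiri_probabilities(sizes, x0=None):
    """p_i / (1 - q) from the text, computed exactly."""
    n_units = len(sizes)
    x0 = max(sizes) if x0 is None else x0
    p = [Fraction(x, n_units * x0) for x in sizes]
    q = 1 - sum(p)
    return [p_i / (1 - q) for p_i in p]

sizes = [3, 2, 5]                           # hypothetical measures of size
print(lahiri_probabilities(sizes), lahiri_draw(sizes, random.Random(7)))
```

Choosing $X_0$ larger than the maximum leaves the final probabilities unchanged but raises q, so more trials are rejected on average.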




3.14 ESTIMATION IN WR PPS SAMPLING 


Let $p_j = X_j/X$ be the probability that the unit $U_j$ is selected in a sample
of one. If n independent selections are made and the value of y for each
selected unit is ascertained, we have the sample

$$\begin{pmatrix} y_1, & y_2, & \ldots, & y_n \\ p_1, & p_2, & \ldots, & p_n \end{pmatrix}$$

The random variable $y_i/p_i$ associated with the ith selection can have values
$Y_1/p_1, Y_2/p_2, \ldots, Y_N/p_N$ with probabilities $p_1, p_2, \ldots, p_N$.
Hence $E(y_i/p_i) = Y$ and

$$V(y_i/p_i) = \sum_{j=1}^{N} p_j \Big( \frac{Y_j}{p_j} - Y \Big)^2$$

Since the random variables $y_i/p_i$ $(i = 1, 2, \ldots, n)$ are independently
and identically distributed, we can, by using Theorem 3.4, immediately
prove the following theorem.


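Theorem 3.8

In pps sampling with replacement, an unbiased estimator of the population
total Y is given by

$$\hat{Y} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i} \tag{3.20}$$

with

$$V(\hat{Y}) = \frac{1}{n} \sum_{j=1}^{N} p_j \Big( \frac{Y_j}{p_j} - Y \Big)^2 \tag{3.21}$$

and

$$v(\hat{Y}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} (z_i - \bar{z})^2, \qquad z_i = \frac{y_i}{p_i}, \quad \bar{z} = \hat{Y} \tag{3.22}$$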


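Remark The expression for the variance of $\hat{Y}$ can be easily given by the
following alternative form:

$$V(\hat{Y}) = \frac{1}{n} \sum{}' \, p_i p_j \Big( \frac{Y_i}{p_i} - \frac{Y_j}{p_j} \Big)^2 \tag{3.23}$$

where $\sum'$ denotes summation over all different pairs of units in the
population. Furthermore,

$$v(\hat{Y}) = \frac{1}{n^2(n-1)} \, S' \Big( \frac{y_i}{p_i} - \frac{y_j}{p_j} \Big)^2 \tag{3.24}$$

where $S'$ denotes summation over the different pairs in the sample.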


Corollary 

We note from (3.21) or (3.23) that $V(\hat{Y}) = 0$ if $Y_i/p_i$ is constant. This
shows that the pps estimate will have zero variance if the measures of
size are such that the variate y is proportionate to x. It is on this
proportionality or near-proportionality that the survey statistician relies 
when the method of pps sampling is decided upon. It is true that the 
values of y are not known in advance. But if we have reason to believe 
that the measures of size chosen are such that y/x is approximately 
constant, we have reason to expect from (3.21) that the variance of the 
estimate will be small. 
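The estimator of Theorem 3.8 and the effect of proportionality can be illustrated with a small sketch. The populations and sample below are hypothetical, the probabilities are passed as exact fractions, and the function names are not from the text.

```python
from fractions import Fraction

def pps_wr_estimate(y, p):
    """Estimate (3.20) and variance estimate (3.22) from sampled values y and their p_i."""
    n = len(y)
    z = [Fraction(y_i) / p_i for y_i, p_i in zip(y, p)]
    total = sum(z) / n
    var_est = sum((z_i - total) ** 2 for z_i in z) / (n * (n - 1))
    return total, var_est

def pps_wr_variance(Y, P, n):
    """True variance (3.21) for a fully known population."""
    total = sum(Y)
    return sum(p_j * (Fraction(y_j) / p_j - total) ** 2 for y_j, p_j in zip(Y, P)) / n

P = [Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)]   # p_i = X_i / X for sizes 3, 2, 5
print(pps_wr_variance([6, 4, 10], P, 2))                # y exactly proportional to x
print(pps_wr_variance([5, 5, 10], P, 2))                # y only roughly proportional to x
print(pps_wr_estimate([5, 10], [P[0], P[2]]))           # a sample consisting of units 1 and 3
```

With y exactly proportional to x every ratio $Y_i/p_i$ equals Y and the variance vanishes; the second population departs from proportionality and the variance becomes positive.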


Remark The method of sampling with unequal probabilities is generally 
used for the selection of large units such as cities, villages, and blocks. 
The measures of size are usually based on information collected from 
censuses of population, agriculture, industry, etc. 


Remark 


Another way of writing the variance of f is 



(3.25) 


Further reading See Exercise 16 for another method of obtaining the 
variance estimator. 


3.15 COMPARISON WITH SAMPLING WITH EQUAL PROBABILITIES

It must be stressed at this stage that the success of pps sampling depends
heavily on the goodness of the measures of size. If these are poor, in the
sense that near-proportionality does not exist, it may be no better than
sampling with equal probabilities. In fact, a comparison of the variances

$$V_1(\hat{Y}) = \frac{X \sum Y_i^2/X_i - Y^2}{n} \qquad V_2(\hat{Y}) = \frac{N \sum Y_i^2 - Y^2}{n}$$

for pps and wr equal probability sampling, respectively, shows that the
former will be superior if

$$\sum (X_i - \bar{X}) \frac{Y_i^2}{X_i} > 0 \tag{3.26}$$

that is, if x and $y^2/x$ are positively correlated (Raj, 1954). The application
of this criterion is difficult in practice. Another point to be borne in
mind is that the correlation coefficient between y and x may be unity, and
yet pps sampling may be worse than sampling with equal probabilities.
This is brought out in the following theorem (Raj, 1954).


Theorem 3.9 


If for a finite population y = a + bx, so that there is perfect correlation
between y and x, pps sampling will be less precise than equal probability
sampling if

$$\frac{\bar{X} \sum (1/X_i) - N}{N \sigma_x^2} > \frac{b^2}{a^2} \tag{3.27}$$

where $\sigma_x^2 = \sum (X_i - \bar{X})^2 / N$.

The proof follows from the substitution of y = a + bx in the inequality
given by (3.26). If a is large, the inequality (3.27) may be easily satisfied.
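A numerical check of Theorem 3.9 may help. The sizes and the coefficients a and b below are hypothetical, the variances are written for a single selection (n = 1), and condition (3.27) is cross-multiplied by $a^2$ so that a = 0 needs no special treatment.

```python
from fractions import Fraction

def variance_pps(Y, X):
    """n * V_1: X * sum(Y_i^2 / X_i) - Y^2."""
    return sum(X) * sum(Fraction(y * y, x) for y, x in zip(Y, X)) - sum(Y) ** 2

def variance_equal(Y):
    """n * V_2: N * sum(Y_i^2) - Y^2."""
    return len(Y) * sum(y * y for y in Y) - sum(Y) ** 2

def pps_is_worse(X, a, b):
    """Condition (3.27), with both sides multiplied by a^2."""
    N = len(X)
    xbar = Fraction(sum(X), N)
    sigma2 = sum((x - xbar) ** 2 for x in X) / N
    lhs = (xbar * sum(Fraction(1, x) for x in X) - N) / (N * sigma2)
    return lhs * a * a > b * b

X = [1, 2, 3]                                   # hypothetical measures of size
for a, b in [(10, 1), (1, 10)]:
    Y = [a + b * x for x in X]
    print(a, b, variance_pps(Y, X), variance_equal(Y), pps_is_worse(X, a, b))
```

With a large intercept (a = 10, b = 1) pps sampling is far worse than equal probability sampling although y and x are perfectly correlated; with a small intercept the position is reversed.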


Further reading For a comparison between pps sampling and equal 
probability sampling when the finite population actually observed is 
assumed to be a random sample from an infinite superpopulation following 
a certain model, see Exercises 4 and 5. 

3.16 SAMPLING WITHOUT REPLACEMENT WITH UNEQUAL PROBABILITIES

A generalization of the wr sampling scheme of Sec. 3.12 would be to
select a pps sample of size unity as before and remove the selected
unit from the population. From the units that remain another pps
sample of size one is taken and the selected unit is removed from the
population. This procedure is continued till n selections are made. This
will give a sample selected without replacement with unequal probabilities.
If n = 2 and $X_i/X$ is denoted by $p_i$ $(i = 1, 2, \ldots, N)$, the probability
that the unit $U_i$ is in the sample would be

$$\pi_i = p_i + \sum_{j \ne i} \frac{p_j p_i}{1 - p_j} = p_i \Big[ 1 + \sum_{j \ne i} p_j (1 - p_j)^{-1} \Big]$$

Further, the probability that units $U_i$ and $U_j$ are both in the sample is

$$\pi_{ij} = p_i p_j (1 - p_i)^{-1} + p_j p_i (1 - p_j)^{-1} = p_i p_j \big[ (1 - p_i)^{-1} + (1 - p_j)^{-1} \big]$$
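The two expressions for n = 2 can be evaluated exactly for a small hypothetical set of initial probabilities. The function name and the 0-based unit indices are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def inclusion_probabilities_n2(p):
    """pi_i and pi_ij for two successive pps selections without replacement."""
    N = len(p)
    pi = [p[i] * (1 + sum(p[j] / (1 - p[j]) for j in range(N) if j != i))
          for i in range(N)]
    pij = {(i, j): p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))
           for i, j in combinations(range(N), 2)}
    return pi, pij

p = [Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)]   # hypothetical p_i = X_i / X
pi, pij = inclusion_probabilities_n2(p)
print(pi, pij)
```

The $\pi_i$ add up to the sample size 2, and with only three units the pair (1, 2) is in the sample exactly when the third unit is left out.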

3.17 A MORE GENERAL SELECTION PROCEDURE 

A more general sampling scheme is to start with some arbitrary
probabilities $p_i$ $(i = 1, 2, \ldots, N)$ for drawing the first member of the sample.
Depending upon the first selection, we can make an arbitrary assignment
of probabilities for the remaining $(N - 1)$ units for making the
second selection. This will give rise to N sets of conditional
probabilities. Similarly, for each pair of units selected in the first two selections
we can specify arbitrary conditional probabilities for the remaining
$(N - 2)$ units. We will thus have $\binom{N}{2}$ sets of conditional probabilities
for making the third selection, and so on. Theoretically, the probability
of every ordered sample $(U_1, U_2, \ldots, U_n)$ can be written down as the
product of $\Pr(U_1)$, $\Pr(U_2 \mid U_1)$, $\Pr(U_3 \mid U_1, U_2)$, $\ldots$,
$\Pr(U_n \mid U_1, \ldots, U_{n-1})$. Consequently, the probability $\pi_i$ that any unit $U_i$ is selected in the
sample can be obtained as the sum of the probabilities of all samples of
size n containing $U_i$. Similarly, $\pi_{ij}$ is obtained as the sum of the
probabilities of all samples containing both $U_i$ and $U_j$.



3.18 ANOTHER TYPE OF SELECTION PROCEDURE 

A generalization can also be made about systematic sampling with equal
probabilities. We cumulate the measures of size of the units and assign
them the ranges 1 to $X_1$, $X_1 + 1$ to $X_1 + X_2$, $X_1 + X_2 + 1$ to
$X_1 + X_2 + X_3$, and so on, as in Sec. 3.12. In order to select a sample of size n,
a random number is taken between 1 and $k = X/n$. The units in the
sample are those in whose range lie the random number i and all other
numbers $i + k, i + 2k, \ldots$, obtained by adding k successively to i. If
there is any unit whose measure of size exceeds $X/n$, it is removed beforehand
from the selection procedure and is taken into the sample with
certainty. The probability that any unit $U_i$ is in the sample is obviously
$X_i/(X/n) = np_i$. There is no simple formula to write down an expression
for $\pi_{ij}$. For a specific arrangement of the units, this could be easily
calculated by finding out which random numbers (from 1 to $X/n$) will
select $U_i$ and $U_j$ simultaneously. If $m_{ij}$ is the number of such random
numbers, $\pi_{ij} = nm_{ij}/X$.
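The counting of random starts that select a given pair can be carried out mechanically. The sketch below assumes integer sizes, X an exact multiple of n, and no unit larger than X/n; the sizes and function names are hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from fractions import Fraction
from itertools import accumulate

def pps_systematic_sample(sizes, n, start):
    """Units whose ranges contain start, start + k, start + 2k, ... with k = X / n."""
    k = sum(sizes) // n
    cumulated = list(accumulate(sizes))
    return tuple(bisect_left(cumulated, start + m * k) for m in range(n))

def sample_counts(sizes, n):
    """How many of the k possible random starts lead to each sample."""
    k = sum(sizes) // n
    return Counter(pps_systematic_sample(sizes, n, s) for s in range(1, k + 1))

sizes = [3, 2, 5, 4, 6]                 # hypothetical measures of size, X = 20
n, k = 2, 10
counts = sample_counts(sizes, n)
pi = [Fraction(sum(c for s, c in counts.items() if i in s), k) for i in range(len(sizes))]
print(counts, pi)
```

Here $m_{ij}$ is simply the count attached to the pair (i, j), so that $\pi_{ij} = m_{ij}/k = nm_{ij}/X$; pairs that never occur together have $\pi_{ij} = 0$ for this arrangement.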


3.19 ESTIMATION PROCEDURES 


We thus find that there is no dearth of procedures for selecting the sample
with unequal probabilities without replacement. (We shall study a few
more later on.) Our next problem is that of estimation. Suppose we
attach constants $c_1, c_2, \ldots, c_N$ to the units $U_1, U_2, \ldots, U_N$,
respectively. A very general linear function of the sample values can then be
written as

$$L(s) = \sum t_i c_i Y_i \tag{3.28}$$

where the $t_i$'s are random variables defined as $t_i = 1$ if $U_i$ occurs in the
sample and 0 otherwise. Obviously, $E(t_i) = \pi_i$ (the probability that
$U_i$ occurs in the sample) and $E(t_i t_j) = \pi_{ij}$ (the probability that $U_i$ and
$U_j$ both occur in the sample).

Thus $V(t_i) = \pi_i - \pi_i^2 = \pi_i(1 - \pi_i)$ and $\mathrm{Cov}(t_i, t_j) = \pi_{ij} - \pi_i\pi_j$.

Now, $EL(s) = \sum c_i Y_i E(t_i) = \sum \pi_i c_i Y_i$.

In order that $L(s)$ be an unbiased estimator of Y, we must have

$$EL(s) = \sum Y_i$$

which gives $c_i = 1/\pi_i$. Thus

$$\hat{Y} = \sum \frac{t_i Y_i}{\pi_i} = S \frac{y_i}{\pi_i} \tag{3.29}$$

and

$$\begin{aligned}
V(\hat{Y}) &= \sum_i \frac{Y_i^2}{\pi_i^2} V(t_i) + \sum_i \sum_{j \ne i} \frac{Y_i Y_j}{\pi_i \pi_j} \mathrm{Cov}(t_i, t_j) \\
&= \sum_i \frac{1 - \pi_i}{\pi_i} Y_i^2 + \sum_i \sum_{j \ne i} \frac{\pi_{ij} - \pi_i\pi_j}{\pi_i \pi_j} Y_i Y_j
\end{aligned} \tag{3.30}$$

This expression for the variance is due to Horvitz and Thompson (1952).

Remark The variance of $\hat{Y}$ depends on the sampling procedure adopted
solely through the quantities $\pi_i$ and $\pi_{ij}$. This is the reason
that these probabilities were determined for the various selection
procedures proposed in the preceding sections.

Remark If the selection procedure is such that $\pi_i \propto Y_i$, the estimator
(3.29) reduces to a constant, and thus has zero variance. Hence, in
practice, we search for measures of size $X_i \propto Y_i$ and try to have a selection
procedure based on the $X_i$ such that $\pi_i \propto X_i$.

Further reading When the number of units within the population is 
small, it may be possible to determine the m so that the variance of the 
estimator (3.29) is minimized. This is the subject matter of Exercise 8. 


3.20 ESTIMATION OF VARIANCE 

Let $f(y)$ be a function of y. We shall define $L_1(s)$ and $L_2(s)$ as

$$\begin{aligned}
L_1(s) &= S \, c_i f(y_i) = \sum t_i c_i f(y_i) \\
L_2(s) &= S' c_{ij} f(y_i) f(y_j) = \sum{}' \, t_{ij} c_{ij} f(y_i) f(y_j)
\end{aligned} \tag{3.31}$$

where the $c_{ij}$'s are determined beforehand for all pairs in the population, and
$t_{ij} = 1$ if both $U_i$ and $U_j$ are in the sample, and 0 otherwise. By the
argument given in Sec. 3.19, we have

$$EL_1(s) = \sum \pi_i c_i f(y_i) \tag{3.32}$$

$$EL_2(s) = \sum{}' \, \pi_{ij} c_{ij} f(y_i) f(y_j) \tag{3.33}$$

This shows that the expected value of any random function of the type
$L_1(s)$ is obtained by multiplying by $\pi_i$ and summing over all units in the
population. And the expected value of a function such as $L_2(s)$ is obtained
by multiplying by $\pi_{ij}$ and adding over all pairs of units in the population.
Now we are in a position to estimate $V(\hat{Y})$ of Eq. (3.30). The first part
is estimated by $S \, y_i^2 (1 - \pi_i)/\pi_i^2$ and the second part by

$$2 S' \, \frac{\pi_{ij} - \pi_i\pi_j}{\pi_{ij}} \, \frac{y_i}{\pi_i} \frac{y_j}{\pi_j}$$

Adding, we have an unbiased estimator as

$$v(\hat{Y}) = S \, \frac{1 - \pi_i}{\pi_i^2} y_i^2 + 2 S' \, \frac{\pi_{ij} - \pi_i\pi_j}{\pi_{ij} \pi_i \pi_j} y_i y_j \tag{3.34}$$

This estimator is due to Horvitz and Thompson (1952).

Remark In order to obtain an unbiased estimator of the variance, the
sampling scheme must be such that all $\pi_{ij} > 0$.
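The estimator (3.29), its variance (3.30), and the variance estimator (3.34) can be checked by complete enumeration for a small design. In the sketch below a design is represented, purely for illustration, as a dictionary mapping each possible sample to its probability; the example design is the two-draw scheme of Sec. 3.16 with hypothetical p and Y.

```python
from fractions import Fraction
from itertools import combinations

def incl(design, *units):
    """Sum of Pr(s) over samples s containing all the given units (pi_i or pi_ij)."""
    return sum(pr for s, pr in design.items() if all(u in s for u in units))

def ht_estimate(s, Y, design):
    """Estimator (3.29) for the sample s."""
    return sum(Y[i] / incl(design, i) for i in s)

def ht_variance(Y, design):
    """Variance (3.30)."""
    units = range(len(Y))
    pi = [incl(design, i) for i in units]
    first = sum((1 - pi[i]) / pi[i] * Y[i] ** 2 for i in units)
    second = sum((incl(design, i, j) - pi[i] * pi[j]) / (pi[i] * pi[j]) * Y[i] * Y[j]
                 for i in units for j in units if j != i)
    return first + second

def ht_variance_estimate(s, Y, design):
    """Variance estimator (3.34) for the sample s."""
    pi = {i: incl(design, i) for i in s}
    first = sum((1 - pi[i]) / pi[i] ** 2 * Y[i] ** 2 for i in s)
    second = sum((incl(design, i, j) - pi[i] * pi[j])
                 / (incl(design, i, j) * pi[i] * pi[j]) * Y[i] * Y[j]
                 for i, j in combinations(s, 2))
    return first + 2 * second

p = [Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)]
Y = [5, 5, 10]
design = {(i, j): p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))
          for i, j in combinations(range(3), 2)}
for s, pr in design.items():
    print(s, pr, ht_estimate(s, Y, design), ht_variance_estimate(s, Y, design))
```

Averaging the printed estimates with the sample probabilities as weights gives Y, and averaging the variance estimates in the same way gives (3.30), since every $\pi_{ij}$ is positive in this design.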


3.21 TWO USEFUL RELATIONS 


The quantities $\pi_i$ and $\pi_{ij}$ are probabilities, and so they lie between 0 and 1.
They are subject to the following further relations:

$$\sum_i \pi_i = n \qquad \sum_{j \ne i} \pi_{ij} = (n - 1)\pi_i \tag{3.35}$$

The proof is simple. For the totality of samples s of size n, $\sum \Pr(s) = 1$.
Now $\sum \pi_i$ is the sum of the probabilities of all samples containing $U_1$,
all samples containing $U_2$, and so on. Thus every $\Pr(s)$ occurs n times
in this sum, once as a sample containing the first member in it, then as a
sample containing the second member in it, and so on. Hence $\sum \pi_i = n$.
In the same way $\sum_{j \ne 1} \pi_{1j}$ is the sum of the probabilities of all samples
containing $U_1$ and $U_2$, $U_1$ and $U_3$, $U_1$ and $U_4$, and so on. Thus every
$\Pr(s)$ containing $U_1$ occurs $(n - 1)$ times in this sum as the sample has
$(n - 1)$ other members in it, and it occurs once for each of these members.
Hence $\sum_{j \ne 1} \pi_{1j} = (n - 1)\pi_1$. In general, $\sum_{j \ne i} \pi_{ij} = (n - 1)\pi_i$.


3.22 AN ALTERNATIVE EXPRESSION FOR THE VARIANCE 


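We shall now exhibit the variance of the estimator (3.29) in wtr sampling
in a form similar to (3.23) obtained for wr sampling. We have, from (3.30),

$$V(\hat{Y}) = \sum_i \frac{1 - \pi_i}{\pi_i} Y_i^2 - \sum_i \sum_{j \ne i} (\pi_i\pi_j - \pi_{ij}) \frac{Y_i}{\pi_i} \frac{Y_j}{\pi_j}$$

Now, by (3.35),

$$\sum_i \sum_{j \ne i} (\pi_i\pi_j - \pi_{ij}) \frac{Y_i^2}{\pi_i^2}
= \sum_i \frac{Y_i^2}{\pi_i^2} \big[ \pi_i (n - \pi_i) - (n - 1)\pi_i \big]
= \sum_i \frac{1 - \pi_i}{\pi_i} Y_i^2$$

Hence

$$V(\hat{Y}) = \sum{}' \Big[ (\pi_i\pi_j - \pi_{ij}) \Big( \frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j} \Big)^2 \Big] \tag{3.36}$$

Using the relation (3.33), an unbiased estimator of $V(\hat{Y})$ is given by

$$v(\hat{Y}) = S' \, \frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}} \Big( \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \Big)^2 \tag{3.37}$$

This expression was first given by Yates and Grundy (1953).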
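The agreement of (3.36) with (3.30) is easy to confirm numerically. The sketch below uses wtr sampling with equal probabilities from a hypothetical population of four units, for which $\pi_i = n/N$ and $\pi_{ij} = n(n-1)/[N(N-1)]$; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def variance_3_30(Y, pi, pij):
    """Horvitz-Thompson form of the variance."""
    N = len(Y)
    first = sum((1 - pi[i]) / pi[i] * Y[i] ** 2 for i in range(N))
    cross = sum((pij[i, j] - pi[i] * pi[j]) / (pi[i] * pi[j]) * Y[i] * Y[j]
                for i, j in combinations(range(N), 2))
    return first + 2 * cross

def variance_3_36(Y, pi, pij):
    """Alternative form as a sum over different pairs of units."""
    return sum((pi[i] * pi[j] - pij[i, j]) * (Y[i] / pi[i] - Y[j] / pi[j]) ** 2
               for i, j in combinations(range(len(Y)), 2))

def yates_grundy(s, Y, pi, pij):
    """Variance estimator (3.37) for the sample s."""
    return sum((pi[i] * pi[j] - pij[i, j]) / pij[i, j] * (Y[i] / pi[i] - Y[j] / pi[j]) ** 2
               for i, j in combinations(sorted(s), 2))

N, n = 4, 2
Y = [1, 2, 3, 6]
pi = [Fraction(n, N)] * N
pij = {pair: Fraction(n * (n - 1), N * (N - 1)) for pair in combinations(range(N), 2)}
print(variance_3_30(Y, pi, pij), variance_3_36(Y, pi, pij), yates_grundy((0, 3), Y, pi, pij))
```

Both forms give the customary variance of the population-total estimator, and the estimator (3.37) averaged over the six equally likely samples reproduces it.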


Remark In the case of with-replacement pps sampling, it is clear from
(3.23) that all pairs of units in the population make a positive contribution
to the variance. This is not necessarily so in the case of without-replacement
pps sampling, as is evident from (3.36). All those pairs for which
$\pi_i\pi_j < \pi_{ij}$ will make a negative contribution to $V(\hat{Y})$. As a result, the
estimator of variance (3.37) or (3.34) may assume negative values for
some samples.

Remark When sampling is carried out without replacement with equal
probabilities, $\pi_{ij} = n(n - 1)/[N(N - 1)]$, $\pi_i = n/N$. Substituting these
in formulas (3.29), (3.30), (3.36), and (3.37), we derive the customary
population-total estimator, its variance, and its variance estimator. In
this case $\pi_i\pi_j - \pi_{ij} > 0$, and all pairs make a positive contribution to the
variance.

Further reading

1. It may be of some interest to use the simpler variance estimator
(3.22) in place of (3.34) or (3.37). The status of this estimator in wtr
sampling is discussed in Exercise 6.

2. Some well-known situations in which the variance estimator (3.37)
must be positive are presented in Exercises 7 and 9.


3.23 COMPARISON OF WTR AND WR SCHEMES 

When is wtr sampling superior to wr sampling with unequal probabilities?
An answer to this question will now be attempted. We start with a set
of probabilities $p_i$ $(i = 1, 2, \ldots, N)$, $\sum p_i = 1$, with which a pps sample
of size n is selected with replacement. Let there be a wtr sampling scheme
for which $\pi_i = np_i$ (this could be achieved, for example, by the method of
Sec. 3.18). Then we prove the following theorem (Raj, 1966).

Theorem 3.10

A sufficient condition for the wtr estimator $S(y_i/\pi_i)$ to have smaller variance
than the wr estimator $n^{-1}S(y_i/p_i)$ with $\pi_i = np_i$, independently of the y's,
is that

$$\pi_{ij} > \frac{n - 1}{n} \pi_i \pi_j \quad \text{for all } i, j \tag{3.38}$$

PROOF From (3.23), the variance of the wr estimator is

$$\frac{1}{n} \sum{}' \, p_i p_j \Big( \frac{Y_i}{p_i} - \frac{Y_j}{p_j} \Big)^2 = \frac{1}{n} \sum{}' \, \pi_i \pi_j \Big( \frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j} \Big)^2$$

since $\pi_i = np_i$. Hence, comparing with (3.36), if

$$\pi_i\pi_j - \pi_{ij} < \frac{\pi_i\pi_j}{n}$$

sampling without replacement will be better. This leads to the
condition $\pi_{ij} > (n - 1)\pi_i\pi_j/n$ for all i, j.


Corollary 


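In sampling without replacement with equal probabilities

$$\pi_{ij} = n(n - 1)[N(N - 1)]^{-1} \qquad \pi_i = \pi_j = \frac{n}{N}$$

Now $\pi_{ij}[(n - 1)\pi_i\pi_j/n]^{-1} = N/(N - 1) > 1$. This proves that the
associated wr scheme is inferior.

A necessary condition for wtr sampling to be better than wr sampling is
obtained in Theorem 3.11.

Theorem 3.11

A necessary condition for the wtr estimator $S(y_i/\pi_i)$ to be better than the wr
estimator $n^{-1}S(y_i/p_i)$ with $\pi_i = np_i$, independently of the y's, is that

$$\pi_{ij} \le \frac{2(n - 1)}{n} \pi_i \pi_j \tag{3.39}$$

PROOF Let the variance (3.30) of the wtr estimator not exceed that of the
wr estimator. Then

$$\sum_i \frac{1 - \pi_i}{\pi_i} Y_i^2 + \sum_i \sum_{j \ne i} \Big( \frac{\pi_{ij}}{\pi_i\pi_j} - 1 \Big) Y_i Y_j \le \sum_i \frac{Y_i^2}{\pi_i} - \frac{Y^2}{n}$$

or

$$\sum_i \sum_{j \ne i} \frac{\pi_{ij}}{\pi_i\pi_j} Y_i Y_j \le n^{-1}(n - 1) \Big( \sum_i Y_i^2 + \sum_i \sum_{j \ne i} Y_i Y_j \Big)$$

or

$$\sum_i Y_i^2 + \sum_i \sum_{j \ne i} \lambda_{ij} Y_i Y_j \ge 0 \qquad \lambda_{ij} = 1 - n\pi_{ij}[(n - 1)\pi_i\pi_j]^{-1}$$

Since this must hold for all values of the y's, it follows that the principal
minors of the matrix $A = (\lambda_{ij})$ with $\lambda_{ii} = 1$ are nonnegative.
Thus $1 - \lambda_{ij}^2 \ge 0$, which leads to (3.39). ■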


Corollary 

In samples of size n = 2 the estimator of variance (3.37) will be positive 
if wtr sampling is superior to with-replacement pps sampling independently 
of the y’s. 


Further reading A wtr sampling scheme which is definitely superior to 
the associated wr sampling scheme from the point of view of variance is 
presented in Exercise 10. 1 


3.24 ANOTHER PROCEDURE OF ESTIMATION IN WTR SAMPLING 

For the more general wtr sampling scheme of Sec. 3.17, in which
conditional probabilities are specified for each selection, an alternative
estimation procedure consists in making direct use of conditional probabilities
without calculating $\pi_i$ and $\pi_{ij}$, which may be difficult to compute for some
sampling schemes. In this procedure the expectations are calculated by
making use of the conditional argument. As an example, suppose two
units are selected from a population in the following manner. The first
selection is made with probabilities $p_i$ $(i = 1, 2, \ldots, N)$ based on the
measures of size $X_i$, and the second selection is made with probabilities
proportionate to the sizes of the remaining units. In this situation, the
conditional probability that $U_j$ is selected when it is known that $U_i$ is the
first selection is given by $p_j/(1 - p_i)$. We form the estimators

$$t_1 = \frac{y_1}{p_1} \qquad t_2 = y_1 + y_2 \frac{1 - p_1}{p_2}$$

where $y_1$, $y_2$ are the variate values associated with the first and second
selections and $p_1$, $p_2$ are the corresponding initial probabilities. Now
$E(t_1) = \sum (Y_i/p_i)p_i = Y$ and $E_2(t_2 \mid t_1) = y_1 + (Y - y_1) = Y$, so that $E(t_2) = Y$.

1 Some other procedures of selecting a sample of two different units from a
universe such that $\pi_i \propto X_i$ are outlined in Exercise 102.

Thus $t_1$ and $t_2$ are unbiased estimators of Y. Again, we have
from (3.25)

$$V(t_1) = \sum{}' \Big[ p_i p_j \Big( \frac{Y_i}{p_i} - \frac{Y_j}{p_j} \Big)^2 \Big]$$

And $V(t_2) = E_1V_2(t_2) + V_1E_2(t_2)$ by Theorem 1.8. As $E_2(t_2) = Y$,
$V_1E_2(t_2) = 0$. Hence we have

$$V(t_2) = E_1 \sum{}'_k \Big[ p_i p_j \Big( \frac{Y_i}{p_i} - \frac{Y_j}{p_j} \Big)^2 \Big]$$

where $\sum'_k$ runs over the $\binom{N-1}{2}$ pairs formed out of the $(N - 1)$ units
eliminating $U_k$, the one selected at the first selection. Thus

$$V(t_2) = \sum_k p_k \sum{}'_k \Big[ p_i p_j \Big( \frac{Y_i}{p_i} - \frac{Y_j}{p_j} \Big)^2 \Big]
= \sum{}' \Big[ (1 - p_i - p_j) \, p_i p_j \Big( \frac{Y_i}{p_i} - \frac{Y_j}{p_j} \Big)^2 \Big] < V(t_1)$$

Further

$$E(t_1 t_2) = E_1[t_1 E_2(t_2 \mid t_1)] = Y^2$$

so that $t_1$ and $t_2$ are uncorrelated. Hence $t = (t_1 + t_2)/2$ is an unbiased
estimator of Y and

$$V(t) = \frac{V(t_1) + V(t_2)}{4} < \frac{V(t_1)}{2}$$


We have thus proved the following theorem (Raj, 1966).

Theorem 3.12

In samples of size two, let the first selection be made with probability
proportionate to $X_i$ $(i = 1, 2, \ldots, N)$ and the second with pp to the
remaining $X_i$. Let

$$t_1 = \frac{y_1}{p_1} \qquad t_2 = y_1 + y_2 \frac{1 - p_1}{p_2} \qquad t = \frac{t_1 + t_2}{2}$$

Then $E(t_1) = E(t_2) = E(t) = Y$,

$$V(t_2) < V(t_1) \qquad V(t) < V\Big( \frac{1}{2} S \frac{y_i}{p_i} \Big)$$

This theorem provides immediately an example of a situation in which
without-replacement pps sampling is superior to with-replacement pps
sampling. This result can be extended to any sample size n.

Theorem 3.13 

Suppose a sample of size n is selected in the manner of Theorem 3.12, that is,
the ith selection is made with probabilities proportionate to the sizes of the
remaining $N - i + 1$ units. Define $t_1 = y_1/p_1$,

$$t_\lambda = y_1 + y_2 + \cdots + y_{\lambda - 1} + y_\lambda \frac{1 - p_1 - \cdots - p_{\lambda - 1}}{p_\lambda} \qquad (\lambda = 2, \ldots, n)$$

Then $E(t_\lambda) = Y$, $E(t_\lambda t_\mu) = Y^2$, $V(t_\lambda) < V(t_{\lambda - 1})$.

PROOF Given the units selected at the first $(\lambda - 1)$ selections, the
conditional expectation of $t_\lambda$ is

$$y_1 + y_2 + \cdots + y_{\lambda - 1} + (Y - y_1 - y_2 - \cdots - y_{\lambda - 1}) = Y$$

so that $t_\lambda$ is an unbiased estimator of Y. If $\lambda < \mu$, given the first $(\mu - 1)$
selections, $E(t_\mu) = Y$ so that $E(t_\lambda t_\mu) = Y^2$, which proves that $t_\lambda$ and $t_\mu$ are
uncorrelated. Further, given the units selected at the first $(\lambda - 2)$
selections, it is proved from Theorem 3.12 that the conditional variance of
$t_\lambda$ is smaller than that of $t_{\lambda - 1}$, and hence $V(t_\lambda) < V(t_{\lambda - 1})$. ■

Corollary

We have $V(t_n) < V(t_{n-1}) < \cdots < V(t_1)$. Defining

$$\bar{t} = \frac{t_1 + t_2 + \cdots + t_n}{n}$$

we get $V(\bar{t}) = (1/n^2) \sum V(t_i) < (1/n)V(t_1) = V(n^{-1} S \, y_i/p_i)$.
This proves that the wtr scheme is superior to the wr scheme.
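The statements of Theorems 3.12 and 3.13 can be verified by enumerating every ordered sample of a small population. The values of p and Y below are hypothetical, units are indexed from 0, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def ordered_samples(p, n):
    """Yield (ordered sample, probability) for successive pps draws without replacement."""
    for s in permutations(range(len(p)), n):
        prob, used = Fraction(1), Fraction(0)
        for i in s:
            prob *= p[i] / (1 - used)   # conditional probability among the remaining units
            used += p[i]
        yield s, prob

def ordered_estimates(s, Y, p):
    """t_1, ..., t_n of Theorem 3.13 for the ordered sample s."""
    ts, seen_y, seen_p = [], 0, Fraction(0)
    for i in s:
        ts.append(seen_y + Y[i] * (1 - seen_p) / p[i])
        seen_y += Y[i]
        seen_p += p[i]
    return ts

def exact_moments(Y, p, n):
    """Means and variances of t_1..t_n, and the variance of their average."""
    total = sum(Y)
    means = [Fraction(0)] * n
    variances = [Fraction(0)] * n
    var_average = Fraction(0)
    for s, prob in ordered_samples(p, n):
        ts = ordered_estimates(s, Y, p)
        for lam, t in enumerate(ts):
            means[lam] += prob * t
            # Squared deviations are taken about Y, which is the mean of every t.
            variances[lam] += prob * (t - total) ** 2
        var_average += prob * (sum(ts) / n - total) ** 2
    return means, variances, var_average

p = [Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)]
Y = [5, 5, 10]
print(exact_moments(Y, p, 2))
```

For this population $V(t_1) = 25/3$ and $V(t_2) = 19/6$, so the average has variance 23/8, against 25/6 for the wr estimator based on two selections.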


Further reading 

1. Refer to Exercise 18, in which an equivalent procedure of sample
selection is considered. In this method sampling with pps is continued
till the sample contains (n + 1) different units. The last unit is rejected,
and the sample consists of the n different units selected. A comparison
of the two situations is made in Exercise 19.

2. For an alternative method of forming estimators when sampling is
done without replacement with unequal probabilities, see Exercise 12.

3. The estimator $\bar{t}$ is an example of an ordered estimator, making use of
the order in which units are selected in the sample. It is proved in
the exercises that an unordered estimator is superior to an ordered one.

41 





3.25 ESTIMATION OF VARIANCE 

An unbiased estimator of the variance of $\bar{t} = \sum t_i/n$ is provided by the
following general theorem.

Theorem 3.14

Let $t_1, t_2, \ldots, t_n$ be uncorrelated random variables with the same
expectation $E(t_i) = \mu$. Let $\bar{t}$ be defined as $\bar{t} = (t_1 + t_2 + \cdots + t_n)/n$. Then
$E(\bar{t}) = \mu$ and an unbiased estimator of $V(\bar{t})$ is given by

$$v(\bar{t}) = \sum (t_i - \bar{t})^2 / [n(n - 1)]$$

PROOF It is obvious that $E(\bar{t}) = \mu$. As $t_i$ and $t_j$ are uncorrelated,
$E(t_i t_j) = \mu^2$, and hence $\sum_i \sum_{j > i} t_i t_j / [n(n - 1)/2]$ is an unbiased estimator
of $\mu^2$. Now

$$V(\bar{t}) = E(\bar{t}^2) - \mu^2$$

Hence

$$v(\bar{t}) = \bar{t}^2 - \frac{2 \sum_i \sum_{j > i} t_i t_j}{n(n - 1)} = \frac{\sum (t_i - \bar{t})^2}{n(n - 1)}$$


Remark The proof of this theorem shows that it is not necessary to
assume in Theorem 3.4 that the variances of the random variables $t_i$ are
the same and that the variables are necessarily independent, if all that is
wanted is an unbiased estimator of the variance.
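The last step of the proof is a purely algebraic identity, which the following sketch checks for a hypothetical set of values $t_1, t_2, t_3$. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def variance_of_mean_estimate(ts):
    """v(t-bar) = sum (t_i - t-bar)^2 / [n(n-1)]."""
    n = len(ts)
    tbar = Fraction(sum(ts), n)
    return sum((t - tbar) ** 2 for t in ts) / (n * (n - 1))

def via_cross_products(ts):
    """t-bar^2 minus the unbiased estimator of mu^2 built from the products t_i t_j."""
    n = len(ts)
    tbar = Fraction(sum(ts), n)
    cross = sum(a * b for a, b in combinations(ts, 2))
    return tbar ** 2 - Fraction(2 * cross, n * (n - 1))

ts = [18, 22, 20]       # hypothetical values of t_1, t_2, t_3 from one sample
print(variance_of_mean_estimate(ts), via_cross_products(ts))
```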


REFERENCES 

Horvitz, D. G. and D. J. Thompson (1952). A generalization of sampling
without replacement from a finite universe. J. Am. Stat. Assoc., 47.

Lahiri, D. B. (1951). A method of sample selection providing unbiased ratio
estimates. Bull. Intern. Stat. Inst., 33.

Narain, R. D. (1951). On sampling without replacement with varying
probabilities. J. Ind. Soc. Agr. Stat., 3.

Raj, D. (1954). On sampling with probabilities proportionate to size. Ganita, 5.

Raj, D. (1966). On a method of sampling with unequal probabilities. Ganita, 17.

Raj, D. and S. H. Khamis (1958). Some remarks on sampling with replacement.
Ann. Math. Stat., 29.

Yates, F. and P. M. Grundy (1953). Selection without replacement from
within strata with probability proportionate to size. J. Roy. Stat. Soc., B15.



CHAPTER FOUR


STRATIFICATION


4.1 INTRODUCTION

It has been seen that in simple random sampling the variance of the 
estimate (say, of the population mean Y) depends, apart from the sample 
size, on the variability of the character y in the population. If the popula¬ 
tion is very heterogeneous and considerations of cost limit the size of the 
sample, it may be found impossible to get a sufficiently precise estimate by 
taking a simple random sample from the entire population. And popula¬ 
tions encountered in practice are generally very heterogeneous. In sur¬ 
veys of manufacturing establishments, for example, it can be found that 
some establishments are very large, that is, they employ 1,000 or more 
persons, but there are many others which have only two or three persons
on their rolls. Any estimate made from a direct random sample taken
from the totality of such establishments would be subject to exceedingly
large sampling fluctuations. But suppose it is possible to divide this
population into parts (or strata) on the basis of, say, employment, thereby
separating the very large ones, the medium-sized ones, and the smaller
ones. If a random sample of establishments is now taken from each
stratum, it should be possible to make a better estimate of the strata
averages, which in turn should help in producing a better estimate of the
population average. Similarly, if a sample is selected with probability
proportionate to x from the entire population, the variance of the
population-total estimate may be very high because the ratio of y to x varies
considerably over the population. If a way can be found of subdividing
the population so that the variation of the ratio of y to x is considerably
reduced within the subdivisions (or strata), a better estimate of the
population total can be made. This is the basic consideration involved in the
use of stratification for improving the precision of estimation. There are, 
however, other considerations. For example, it is advisable to treat 
certain parts of the population as strata if estimates are wanted separately 
for them. If the main purpose of stratification is to achieve higher pre¬ 
cision, a number of questions arise for which answers must be found. 
How should the strata be made and how many of them should be made? 
How should the total sample be allocated to the strata? How should 
data be analyzed (estimates made and their variances calculated) from a 
stratified design? We shall answer these questions in inverse order. 


4.2 ESTIMATION IN STRATIFIED SAMPLING 

We shall begin by proving the following basic result. 

Theorem 4.1 

Let a population of N units be divided up into L strata, the hth stratum
containing $N_h$ units with a total of $Y_h$ for the character y. Within each
stratum a probability sample is selected, sampling in one stratum being
independent of that within another. Let $\hat{Y}_h$ be an unbiased estimate of $Y_h$, based
on a sample of size $n_h$ taken from the stratum. Further, let $v(\hat{Y}_h)$ be an
unbiased estimate of $V(\hat{Y}_h)$. Then $\hat{Y} = \sum \hat{Y}_h$ is an unbiased estimate of Y,

$$V(\hat{Y}) = \sum V(\hat{Y}_h) \qquad v(\hat{Y}) = \sum v(\hat{Y}_h) \tag{4.1}$$

PROOF We have $E(\hat{Y}_h) = Y_h$. By Theorem 1.1 then,

$$E(\hat{Y}) = \sum E(\hat{Y}_h) = \sum Y_h = Y$$

Since sampling in one stratum is independent of sampling in another, the
random variables $\hat{Y}_h$ $(h = 1, 2, \ldots, L)$ are mutually independent.
Hence by Theorem 1.5 we have $V(\hat{Y}) = \sum V(\hat{Y}_h)$. By the same
argument, $E \sum v(\hat{Y}_h) = \sum V(\hat{Y}_h)$.

Theorem 4.1 states that the population-total estimate is the sum of
the estimates of individual strata totals. The variances add up, and so do
the estimates of variances. Thus no new principles are involved in
analyzing the data provided that the estimation problem can be solved
within a stratum. But we have already studied in Chap. 3 how estimates
are to be made within a stratum. Thus we have proved the following
results.

Theorem 4.2 

In wtr simple random samples of size $n_h$, $\sum n_h = n$, within strata, an
unbiased estimate of the population total is provided by

$$\hat{Y} = \sum N_h \bar{y}_h$$

with a variance of

$$V(\hat{Y}) = \sum N_h^2 \frac{1}{n_h}(1 - f_h) S_{yh}^2$$

and

$$\hat{V}(\hat{Y}) = \sum N_h^2 \frac{1}{n_h}(1 - f_h) s_{yh}^2$$

where $f_h = n_h/N_h$ is the sampling fraction in the hth stratum, the variance
being $S_{yh}^2 = \sum_i (Y_{ih} - \bar{Y}_h)^2/(N_h - 1)$ and

$$s_{yh}^2 = \frac{1}{n_h - 1} \sum_i (y_{ih} - \bar{y}_h)^2$$


Corollary 

The population mean $\bar{Y} = Y/N$ is estimated by $\hat{Y}/N$.


Remark If the units within strata are similar with respect to y, the
strata variances $S_{yh}^2$ are small, which will produce a small value of $V(\hat{Y})$.
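
The computation in Theorem 4.2 can be sketched in a few lines. This is a minimal illustration, assuming simple random sampling without replacement within each stratum and at least two sampled units per stratum; the function name `stratified_total` and the data are hypothetical and do not come from the text.

```python
from statistics import mean, variance


def stratified_total(samples, stratum_sizes):
    """Estimate the population total and its variance as in Theorem 4.2.

    samples       -- one list of observed y-values per stratum (n_h >= 2)
    stratum_sizes -- the stratum sizes N_h, in the same order
    """
    total_hat = 0.0
    var_hat = 0.0
    for y, N_h in zip(samples, stratum_sizes):
        n_h = len(y)
        f_h = n_h / N_h                       # sampling fraction f_h
        total_hat += N_h * mean(y)            # N_h * ybar_h
        # statistics.variance divides by n_h - 1, which is s_yh^2
        var_hat += N_h ** 2 * (1 - f_h) * variance(y) / n_h
    return total_hat, var_hat


# Hypothetical data: two strata containing 30 and 10 units.
estimate, variance_estimate = stratified_total([[2, 4, 6], [10, 14]], [30, 10])
print(estimate, variance_estimate)
```

Dividing the estimate by N and the variance estimate by $N^2$ gives the corresponding results for the population mean.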


Remark The chance that a unit belonging to stratum h will be selected
in the sample would be $n_h/N_h$. The chance that two units belonging to
the same stratum will both be in the sample is
$n_h(n_h - 1)/[N_h(N_h - 1)]$. If the two units belong to different strata
(say, h and k), the chance of their joint appearance is
$n_h n_k/[N_h N_k]$. Thus the procedure is different from
taking a direct random sample from the entire population.


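Theorem 4.3

In wtr simple random samples of size $n_h$, $\sum n_h = n$, within strata, an
unbiased estimate of the population proportion P is given by

$$\hat{P} = \sum N_h p_h/N = \sum W_h p_h$$

with a variance of

$$V(\hat{P}) = \sum W_h^2 \frac{N_h}{N_h - 1} \frac{1 - f_h}{n_h} P_h(1 - P_h)$$

and

$$\hat{V}(\hat{P}) = \sum W_h^2 (1 - f_h) \frac{p_h(1 - p_h)}{n_h - 1}$$

where $p_h$ is the sample proportion in the hth stratum, and $W_h$ is the stratum
weight.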

Corollary 

If $N_h/(N_h - 1)$ can be taken as unity, we get

$$V(\hat{P}) \doteq \sum W_h^2 \frac{1 - f_h}{n_h} P_h(1 - P_h)$$


Remark The variance of $\hat{P}$ depends on the product of $P_h$ and $1 - P_h$
within strata. The product is small if $P_h$ is near to zero or to unity.
Thus higher precision will be achieved if the strata can be formed so 
that units belonging to the given class (for which the proportion is sought) 
can be allocated to the same stratum. 

Theorem 4.4 

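If $n_h$ units within strata are selected with replacement with probabilities
$p_{jh}$ proportional to a character x, an unbiased estimate of the population
total would be

$$\hat{Y} = \sum_h \frac{1}{n_h} \sum_j \frac{y_{jh}}{p_{jh}} = \sum_h \hat{Y}_h$$

with

$$V(\hat{Y}) = \sum_h \frac{1}{n_h} \sum_j p_{jh} \left(\frac{Y_{jh}}{p_{jh}} - Y_h\right)^2$$

and

$$\hat{V}(\hat{Y}) = \sum_h \frac{1}{n_h(n_h - 1)} \sum_j \left(\frac{y_{jh}}{p_{jh}} - \hat{Y}_h\right)^2$$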


Corollary 

If y is nearly proportional to x within strata, the variance of $\hat{Y}$ is
expected to be small.

Remark If there is prior knowledge that the ratio of y to x for
some units would be very much different from the rest, such units should
be segregated and allocated to a separate stratum.







4.3 ALLOCATION OF SAMPLE TO STRATA 


The question of how the sample numbers $n_h$ could best be determined
will be answered now. The principle, of course, is that the sample be
allocated to the strata in such a way that for a given cost of the survey the
variance of the estimate be a minimum (Sec. 2.13). Let $c_h$ be the cost of
collecting information from a unit in stratum h. (These costs can differ
substantially between strata. For example, information from large
manufacturing establishments can be obtained cheaply if we mail them a
questionnaire, whereas small establishments may have to be visited
personally in order to get acceptable data.) Let


$$C = c_0 + \sum c_h n_h \tag{4.2}$$

be the total cost of the survey. The variance of $\hat{Y}$ will be of the form
(Stuart, 1954)

$$V(\hat{Y}) = \sum A_h/n_h \tag{4.3}$$

where the component independent of $n_h$ is ignored, since it is not relevant
to the problem of determining the best $n_h$. Now, by Cauchy's inequality
$(\sum a_i^2)(\sum b_i^2) \ge (\sum a_i b_i)^2$, we have

$$\left(\sum A_h/n_h\right)\left(\sum c_h n_h\right) \ge \left(\sum \sqrt{A_h c_h}\right)^2$$

there being equality if and only if $b_i$ is proportional to $a_i$, that is,

$$\frac{c_h n_h}{A_h/n_h} = \text{const} \quad \text{or} \quad n_h \propto \sqrt{A_h/c_h} \tag{4.4}$$




Thus the product of the variance $V(\hat{Y})$ and the cost C is a minimum when
relation (4.4) is satisfied. This amounts to minimizing $V(\hat{Y})$ for a fixed
C or vice versa. From this it follows that $n_h$ will be small if the cost of
collecting information from stratum h is large. An application of (4.4) to
some stratified sampling schemes will now be made.
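
As a rough numerical illustration of relation (4.4), the sketch below allocates a fixed field budget $C - c_0$ across strata with $n_h$ proportional to $\sqrt{A_h/c_h}$. The function name `optimal_allocation` and all figures are hypothetical, and the $n_h$ are left as real numbers (no rounding to integers).

```python
from math import sqrt


def optimal_allocation(A, c, budget):
    """Return n_h proportional to sqrt(A_h / c_h) with sum(c_h * n_h) == budget.

    A      -- the coefficients A_h in V = sum(A_h / n_h)
    c      -- the unit costs c_h
    budget -- the amount C - c_0 available for field work
    """
    denom = sum(sqrt(a * ch) for a, ch in zip(A, c))
    return [budget * sqrt(a / ch) / denom for a, ch in zip(A, c)]


# Hypothetical two-stratum example.
A = [400.0, 100.0]
c = [1.0, 4.0]
n = optimal_allocation(A, c, budget=100.0)
cost = sum(ch * nh for ch, nh in zip(c, n))
var = sum(a / nh for a, nh in zip(A, n))
print(n, cost, var)
```

At this allocation the product of variance and cost equals $(\sum \sqrt{A_h c_h})^2$, the lower bound given by Cauchy's inequality.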


4.4 ALLOCATION IN SIMPLE RANDOM SAMPLING 


In case sampling within strata is simple random, we find from Theorem 4.2
that $A_h = N_h^2 S_{yh}^2$. By (4.4) the best sample sizes within strata are given
by

$$\frac{n_h}{n} = \frac{N_h S_{yh}/\sqrt{c_h}}{\sum N_h S_{yh}/\sqrt{c_h}} \tag{4.5}$$

This means that the sample size in a stratum should be larger if the
stratum contains more units ($N_h$ is higher), or is more variable for y ($S_{yh}$
is bigger), or cheaper to investigate ($c_h$ is smaller). This allocation of the
total sample size to strata is called optimum or minimum-variance
allocation and is due to Neyman (1934). This is appropriate when the cost
function given by (4.2) holds. Substituting in (4.2) the value of $n_h$ from
(4.5) we get

$$C - c_0 = \frac{n \sum N_h S_{yh} \sqrt{c_h}}{\sum N_h S_{yh}/\sqrt{c_h}} \tag{4.6}$$

which gives the value of n when the total cost is fixed. Equations (4.5)
and (4.6) will provide the best values of $n_h$, for which $V(\hat{Y})$ would be a
minimum for fixed cost. In case the object is to minimize the cost of the
survey for a specified value of $V(\hat{Y})$, namely, when

$$V(\hat{Y}) = \sum N_h^2 S_{yh}^2/n_h - \sum N_h S_{yh}^2 = V_0 \tag{4.7}$$

the value of n will be calculated from (4.7) instead of from (4.6). In case
$c_h = c$, it is easy to see that the minimum variance would be given by

$$V_{opt} = \frac{1}{n}\left(\sum N_h S_{yh}\right)^2 - \sum N_h S_{yh}^2 \tag{4.8}$$

In order to achieve this minimum variance the standard deviations $S_{yh}$ in
formula (4.5) should be known or stable estimates of them from previous
surveys on the population should be available. When information on
strata variances is not available, one may decide to use the allocation

$$n_h = nN_h/N \tag{4.9}$$

which will be called N-proportional allocation. With this allocation the
variance of $\hat{Y}$ in wtr simple random sampling within strata would be

$$V_{prop} = \frac{N}{n}\left(1 - \frac{n}{N}\right) \sum N_h S_{yh}^2 = \frac{N^2(1 - f)}{n} \sum W_h S_{yh}^2 \tag{4.10}$$

The corresponding results for wr simple random sampling are

$$V_{opt} = \frac{1}{n}\left(\sum N_h \sigma_{yh}\right)^2 \qquad V_{prop} = \frac{N}{n} \sum N_h \sigma_{yh}^2 \tag{4.11}$$

In this case, it is possible to show that $V_{prop}$ is smaller than $N^2\sigma_y^2/n$, which
is the variance for a wr simple random sample taken from the entire
population (without stratifying it). The proof follows from the
observation that

$$\frac{N^2\sigma_y^2}{n} = \frac{N}{n} \sum_h \sum_i (Y_{ih} - \bar{Y}_h + \bar{Y}_h - \bar{Y})^2 = \frac{N}{n} \sum N_h \sigma_{yh}^2 + \frac{N}{n} \sum N_h (\bar{Y}_h - \bar{Y})^2 \ge \frac{N}{n} \sum N_h \sigma_{yh}^2 = V_{prop}$$

This shows that proportional allocation will be very beneficial if the strata
averages $\bar{Y}_h$ differ considerably from each other. If the strata made are
such that their means are about the same, stratification (along with
proportional allocation) will bring about only slight reduction in the
variance. Another advantage of proportional allocation is that the
estimator $\hat{Y}$ assumes the simple form $\hat{Y} = N \sum_h \sum_i y_{ih}/n$, which does not
require the use of strata weights. Such an estimator is said to be
self-weighting.
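
The ordering of the three wr variances discussed in this section (optimum allocation with equal costs, N-proportional allocation, and no stratification) can be checked numerically. The following is a sketch under stated assumptions: the helper `wr_variances` and the population figures are hypothetical, and the stratum standard deviations use the divisor $N_h$.

```python
def wr_variances(N_h, mean_h, sigma_h, n):
    """Variances of the estimated total for wr simple random sampling.

    Returns (V_opt, V_prop, V_ran): optimum and N-proportional allocation
    as in (4.11), and unstratified sampling, N^2 * sigma_y^2 / n.
    """
    N = sum(N_h)
    grand_mean = sum(Nh * m for Nh, m in zip(N_h, mean_h)) / N
    within = sum(Nh * s ** 2 for Nh, s in zip(N_h, sigma_h))
    between = sum(Nh * (m - grand_mean) ** 2 for Nh, m in zip(N_h, mean_h))
    v_opt = sum(Nh * s for Nh, s in zip(N_h, sigma_h)) ** 2 / n
    v_prop = N * within / n
    v_ran = N * (within + between) / n   # N * sigma_y^2 = within + between
    return v_opt, v_prop, v_ran


# Hypothetical population: two strata with different means and spreads.
v_opt, v_prop, v_ran = wr_variances([60, 40], [10.0, 20.0], [2.0, 6.0], n=20)
print(v_opt, v_prop, v_ran)
```

The gap between the last two values is the between-strata component, which is what proportional allocation removes.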


Further reading See Exercise 23, in which it is shown that moderate 
departures of the actual allocation from the optimum do not lead to any 
appreciable increase in the variance. 


4.4.1 X-PROPORTIONAL ALLOCATION 

In case measures of size $X_{ih}$ are available for all units in the population,
the sample sizes $n_h$ may be found as a proportion of $X_h$ (aggregate measure
of size of stratum h) rather than of $N_h$. This allocation will be called
X-proportional. In this case

$$n_h = \frac{nX_h}{X} \tag{4.12}$$

and the variance of $\hat{Y}$, for wr simple random sampling, will be

$$V_{X\text{-prop}} = \frac{X}{n} \sum \frac{N_h^2 \sigma_{yh}^2}{X_h} = \frac{N}{n} \sum \frac{N_h \sigma_{yh}^2}{\bar{X}_h/\bar{X}} \tag{4.13}$$

whereas N-proportional allocation will give a variance of

$$V_{N\text{-prop}} = \frac{N}{n} \sum N_h \sigma_{yh}^2 \tag{4.14}$$

Consider now skew populations in which a small proportion of the units
accounts for a large proportion of the total (of y). Examples are
employment in manufacturing industries and income of individuals. The stratum
containing the very large units will be found to be many times more
variable than other strata. In the case of N-proportional allocation the
contribution of this stratum to the total variance will be very considerable,
as is evident from formula (4.14). But since its average $\bar{X}_h$ will be many
times greater than the general average $\bar{X}$, the factor $\bar{X}_h/\bar{X}$ in the
denominator of (4.13) will exert a damping effect on the variance if X-proportional
allocation is used. This analysis shows how important it is to use
X-proportional and not N-proportional allocation when the problem is to
estimate totals or means of populations which are very skew in nature.
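
The damping effect can be seen on a small made-up skew population. The sketch below evaluates (4.13) and (4.14) for wr simple random sampling; the function names and all figures are hypothetical.

```python
def v_n_prop(N_h, sigma2_h, n):
    """Variance (4.14) under N-proportional allocation."""
    N = sum(N_h)
    return N / n * sum(Nh * s2 for Nh, s2 in zip(N_h, sigma2_h))


def v_x_prop(N_h, X_h, sigma2_h, n):
    """Variance (4.13) under X-proportional allocation, n_h = n * X_h / X."""
    X = sum(X_h)
    return X / n * sum(Nh ** 2 * s2 / Xh
                       for Nh, Xh, s2 in zip(N_h, X_h, sigma2_h))


# Hypothetical skew population: 10 large units hold more than half of X
# and form a stratum far more variable than the other 90 units.
N_h = [90, 10]
X_h = [900.0, 1100.0]
sigma2_h = [4.0, 400.0]
print(v_n_prop(N_h, sigma2_h, n=20), v_x_prop(N_h, X_h, sigma2_h, n=20))
```

With these figures X-proportional allocation gives roughly a third of the variance of N-proportional allocation, because most of the sample goes to the variable stratum.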

Further reading For a skew population it may be considered desirable 
to take the m largest units into the sample with certainty and select a 
sample from the rest. How to determine the point of cutoff (beyond 
which to include all units with certainty) is discussed in Exercise 98. 


4.4.2 ESTIMATION OF PROPORTIONS 

Before concluding this subject, some comments will be made on the 
problem of sample allocation when the object is to estimate a population 
proportion P. By Theorem 4.3, we have for wtr simple random sampling

$$V(\hat{P}) \doteq \sum W_h^2 \frac{1 - f_h}{n_h} P_h(1 - P_h)$$

With N-proportional allocation, $V_{prop} = \frac{1 - f}{n} \sum W_h P_h(1 - P_h)$.
If the optimum allocation can be used, $n_h$ will be chosen proportional to
$N_h \sqrt{P_h(1 - P_h)}$. This allocation will differ substantially from
proportional allocation only if the quantities $\sqrt{P_h(1 - P_h)}$ differ considerably
from stratum to stratum. For example, let the $P_h$ lie between 0.3 and
0.7, in which case $\sqrt{P_h(1 - P_h)}$ will lie between 0.46 and 0.50. In this
situation the optimum allocation will not be preferred to proportional
allocation when the simplicity of the computations involved is another
factor to be taken into account.


4.5 ALLOCATION IN UNEQUAL PROBABILITY SAMPLING 

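When sampling within strata is wr with probability proportional to x
(Theorem 4.4), the variance of $\hat{Y}$ can be written as

$$V(\hat{Y}) = \sum_h \frac{X_h}{n_h} \sum_j X_{jh} \left(\frac{Y_{jh}}{X_{jh}} - R_h\right)^2 \tag{4.15}$$

where $R_h = Y_h/X_h$. The formula for the optimum allocation of the total
sample size to the strata is apparent, although difficult to apply. On the
other hand, the so-called X-proportional allocation, namely, $n_h = nX_h/X$,
is easy to handle. In this case (Raj, 1963) the variance of $\hat{Y}$ would be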


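$$V_{X\text{-prop}} = \frac{X}{n} \sum_h \sum_j X_{jh} \left(\frac{Y_{jh}}{X_{jh}} - \frac{Y_h}{X_h}\right)^2 = \frac{X}{n} \sum_h \sum_j \frac{Y_{jh}^2}{X_{jh}} - \frac{X}{n} \sum_h \frac{Y_h^2}{X_h} \tag{4.16}$$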


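If no stratification be made and a wr pps sample of n units is taken
from the unstratified population, we have

$$V_1 = \frac{X}{n} \sum_h \sum_j \frac{Y_{jh}^2}{X_{jh}} - \frac{Y^2}{n} \tag{4.17}$$

Hence, writing $R_h = Y_h/X_h$ and $R = Y/X$,

$$V_1 = V_{X\text{-prop}} + \frac{X}{n} \sum X_h (R_h - R)^2 \tag{4.18}$$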


which shows that stratification with X-proportional allocation will be
always superior to unstratified sampling. But the gains from
stratification will be considerable only when the strata ratios $R_h$ differ considerably
from each other.

Further reading For a comparison of the unstratified pps sampling
scheme with stratified simple random sampling under a certain model,
see Exercise 27.


4.6 FORMATION OF STRATA

The question of how strata are to be made will now be considered. As stated
earlier, the basic consideration involved is that
the strata should be internally homogeneous. For example, if units are
to be selected at random from within strata, the strata variances for the
character under estimation should be as small as possible. This can be
achieved by allocating units believed to be similar to the same stratum.
Thus all prior knowledge, personal intuition, and judgment can be brought
into play to bring about similarity within the strata. The ideal situation is
that in which the distribution of y is available. The strata would then be
created by cutting this distribution at suitable points. In the absence
of this information, a search will be made for the distribution of another
character x, correlated with y, obtained at a recent census.
We are going to consider the problem when the distribution of y
is known. Although it is not a practical situation and the results obtained
could not be used directly as such, the discussion is given in the hope that
it will offer some guidance in practice. Let the distribution of y be
continuous with the density function f(y), a < y < b. In order to make L
strata, the range of y is to be cut up at points $y_1 < y_2 < \cdots < y_{L-1}$.
The relative frequency $W_h$, the mean $M_h$, and the variance $\sigma_h^2$ of the hth
stratum are given by


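$$W_h = \int_{y_{h-1}}^{y_h} f(t)\,dt \qquad W_h M_h = \int_{y_{h-1}}^{y_h} t f(t)\,dt \qquad W_h \sigma_h^2 = \int_{y_{h-1}}^{y_h} t^2 f(t)\,dt - W_h M_h^2 \tag{4.19}$$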


The population mean is $M = \sum W_h M_h$, and its estimate obtained from a
stratified random sample is $\hat{M} = \sum W_h \bar{y}_h$ with a variance of

$$V(\hat{M}) = \sum W_h^2 \frac{\sigma_h^2}{n_h} \tag{4.20}$$

Obviously, $V(\hat{M})$ is a function of the strata boundaries $y_h$.
The problem is to determine the best values of $y_h$ for which $V(\hat{M})$ becomes
a minimum for a given allocation of the $n_h$ (Dalenius, 1950).


4.6.1 PROPORTIONAL ALLOCATION 
If $n_h = nW_h$, we have

$$V(\hat{M}) = \frac{1}{n} \sum W_h \sigma_h^2 \tag{4.21}$$


To determine the best values of $y_h$, we differentiate $\sum W_h \sigma_h^2$ with respect
to $y_h$ and equate the expression thus obtained to zero. The expression in
$\sum W_h \sigma_h^2$ involving $y_h$ is

$$A = -\frac{(W_h M_h)^2}{W_h} - \frac{(W_{h+1} M_{h+1})^2}{W_{h+1}}$$

Now

$$\frac{\partial W_h}{\partial y_h} = -\frac{\partial W_{h+1}}{\partial y_h} = f(y_h) \qquad \frac{\partial}{\partial y_h}(W_h M_h)^2 = 2W_h M_h y_h f(y_h)$$

and

$$\frac{\partial}{\partial y_h}(W_{h+1} M_{h+1})^2 = -2W_{h+1} M_{h+1} y_h f(y_h)$$

Hence, $\partial A/\partial y_h = 0$ gives

$$y_h = \tfrac{1}{2}(M_h + M_{h+1}) \tag{4.22}$$

which shows that the best $y_h$ is the average of the two strata means which
it separates. The points $y_h$ (h = 1, 2, ..., L - 1) will have to be
found by iterative procedures, since the $M_h$ depend on $y_h$.
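
One possible iterative procedure for (4.22) is sketched below. It is an illustration only: a finite list of values stands in for the continuous density f(y), the starting points are equal-width cuts, at least two strata are assumed, and the function name `proportional_boundaries` is hypothetical.

```python
def proportional_boundaries(values, L, iters=200):
    """Iterate y_h = (M_h + M_{h+1}) / 2 on a finite set of y-values (L >= 2)."""
    ys = sorted(values)
    lo, hi = ys[0], ys[-1]
    # start from L strata of equal width
    cuts = [lo + (hi - lo) * h / L for h in range(1, L)]
    for _ in range(iters):
        edges = [float("-inf")] + cuts + [float("inf")]
        means = []
        for a, b in zip(edges[:-1], edges[1:]):
            group = [y for y in ys if a < y <= b]
            if not group:          # an empty stratum: stop with current cuts
                return cuts
            means.append(sum(group) / len(group))
        new_cuts = [(m1 + m2) / 2 for m1, m2 in zip(means[:-1], means[1:])]
        if max(abs(u - v) for u, v in zip(new_cuts, cuts)) < 1e-12:
            return new_cuts
        cuts = new_cuts
    return cuts


# A fine grid on (0, 1) approximates the rectangular distribution.
grid = [(i + 0.5) / 1000 for i in range(1000)]
print(proportional_boundaries(grid, L=4))
```

For the rectangular distribution the iteration settles at equally spaced boundaries, as symmetry suggests. For skew distributions the result can depend on the starting points, so it should be treated as a local solution.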


Further reading For an extension of stratification problems to two
dimensions, see Exercise 29.


4.6.2 EQUAL ALLOCATION 

This is a situation of considerable practical interest. For reasons of 
administrative convenience or otherwise, it is often found desirable to 
take the same sample size from each stratum. In this case the best y h 
are to be obtained by minimizing 

$$V(\hat{M}) = \frac{L}{n} \sum W_h^2 \sigma_h^2 \tag{4.23}$$

Differentiating with respect to $y_h$ and equating the
resulting expression to zero, it will be found that the best points of
stratification are those for which

$$W_h[\sigma_h^2 + (y_h - M_h)^2] - W_{h+1}[\sigma_{h+1}^2 + (y_h - M_{h+1})^2] = 0 \qquad h = 1, 2, \ldots, L - 1 \tag{4.24}$$

Iterative procedures will have to be used to get these points. 


4.6.3 OPTIMUM ALLOCATION 
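In this allocation

$$n_h = \frac{nW_h \sigma_h}{\sum W_h \sigma_h} \qquad V(\hat{M}) = \frac{1}{n}\left(\sum W_h \sigma_h\right)^2 \tag{4.25}$$

Then the best values of $y_h$ will be obtained by minimizing
$B = W_h \sigma_h + W_{h+1} \sigma_{h+1}$ for variations in $y_h$.
From (4.19), we have

$$\frac{\partial}{\partial y_h}(W_h \sigma_h^2) = \sigma_h^2 \frac{\partial W_h}{\partial y_h} + 2W_h \sigma_h \frac{\partial \sigma_h}{\partial y_h} = f(y_h)\sigma_h^2 + 2W_h \sigma_h \frac{\partial \sigma_h}{\partial y_h} = f(y_h)(y_h - M_h)^2$$

and hence

$$W_h \frac{\partial \sigma_h}{\partial y_h} = (2\sigma_h)^{-1} f(y_h)[(y_h - M_h)^2 - \sigma_h^2]$$

Then

$$\frac{\partial}{\partial y_h}(W_h \sigma_h) = \sigma_h \frac{\partial W_h}{\partial y_h} + W_h \frac{\partial \sigma_h}{\partial y_h} = (2\sigma_h)^{-1} f(y_h)[(y_h - M_h)^2 + \sigma_h^2]$$

Similarly

$$\frac{\partial}{\partial y_h}(W_{h+1} \sigma_{h+1}) = -(2\sigma_{h+1})^{-1} f(y_h)[(y_h - M_{h+1})^2 + \sigma_{h+1}^2]$$

Hence $\partial B/\partial y_h = 0$ gives the equations

$$\frac{(y_h - M_h)^2 + \sigma_h^2}{\sigma_h} - \frac{(y_h - M_{h+1})^2 + \sigma_{h+1}^2}{\sigma_{h+1}} = 0 \qquad h = 1, 2, \ldots, L - 1 \tag{4.26}$$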

These equations for the $y_h$ are difficult to solve, since the quantities $M_h$,
$\sigma_h^2$ themselves depend upon $y_h$. For this reason, approximate rules have
been found by which $V(\hat{M})$ in (4.25) can be directly made small. Probably
the best approximate solution has been given by Dalenius and Hodges
(1959). The basic argument used by them is that the distribution of y
within strata can be assumed to be rectangular if the number of strata is
large. In that case

$$W_h = (y_h - y_{h-1}) f_h \qquad \sigma_h = \frac{y_h - y_{h-1}}{\sqrt{12}}$$

where $f_h$ is the constant value of the density within the hth stratum, so that

$$\sqrt{12} \sum W_h \sigma_h = \sum f_h (y_h - y_{h-1})^2 = \sum [\sqrt{f_h}\,(y_h - y_{h-1})]^2$$

Defining $G(h) = \int_a^{y_h} \sqrt{f(t)}\,dt$, we have

$$G(h) - G(h - 1) = \int_{y_{h-1}}^{y_h} \sqrt{f(t)}\,dt \doteq \sqrt{f_h}\,(y_h - y_{h-1})$$

Hence $\sqrt{12} \sum W_h \sigma_h \doteq \sum [G(h) - G(h - 1)]^2$, which is a minimum
when $G(h) - G(h - 1)$ is constant, since the sum of these differences is
fixed. This means that the points $y_h$ are to
be obtained by taking equal intervals on the cumulative of $\sqrt{f(y)}$.

Another approximate rule is suggested by Ekman (1959), by which
$W_h(y_h - y_{h-1})$ is made constant. There is another device, namely, to
make strata of equal width $y_h - y_{h-1}$. Cochran (1961) used these rules
on a number of actual populations and found that the rule given
by Dalenius and Hodges (1959) worked best.
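
The rule of equal intervals on the cumulative of $\sqrt{f(y)}$ can be sketched numerically as follows. This is an illustration under stated assumptions: the density is supplied as a Python function that is positive on (a, b), the cumulative is approximated by the midpoint rule on a fine grid, and the name `cum_sqrt_f_boundaries` is hypothetical.

```python
from math import sqrt


def cum_sqrt_f_boundaries(density, a, b, L, grid=10000):
    """Cut points taking equal intervals on the cumulative of sqrt(f)."""
    step = (b - a) / grid
    cum = [0.0]                      # cum[i] approximates G at a + i * step
    for i in range(grid):
        t = a + (i + 0.5) * step
        cum.append(cum[-1] + sqrt(density(t)) * step)
    total = cum[-1]
    cuts, i = [], 0
    for h in range(1, L):
        target = total * h / L
        while cum[i + 1] < target:   # find the grid cell containing the target
            i += 1
        # linear interpolation inside grid cell i
        frac = (target - cum[i]) / (cum[i + 1] - cum[i])
        cuts.append(a + (i + frac) * step)
    return cuts


# The density 2(1 - y) on (0, 1), cut into two strata.
print(cum_sqrt_f_boundaries(lambda t: 2.0 * (1.0 - t), 0.0, 1.0, L=2))
```

For this density the cumulative is available in closed form, and the single cut point should be close to $1 - (1/2)^{2/3}$. In practice the density would be replaced by a frequency table of y or of a correlated character.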


4.6.4 STRATA OF EQUAL AGGREGATE SIZE 
Another rule widely used in practice is to make strata which have the
same aggregate size $W_h M_h$. This rule is considered to be particularly
useful when a constant sample number is to be taken from each stratum.
The suggestion came originally from Mahalanobis (1952). Raj (1964)
tested this rule on four theoretical distributions to find out how it
compared with the optimum (Sec. 4.6.2) for constant sample numbers within
strata. The distributions considered were:

$(2/\pi)^{1/2} \exp(-y^2/2)$, y > 0
$\exp(-y)$, y > 0
$y \exp(-y)$, y > 0
$2(1 - y)$, 0 < y < 1

For these distributions the rule was found to give poorer results as the
number of strata increased. It was not optimum or near-optimum for
large L. An explanation found was that the lowest stratum created by
this rule was highly variable.

Further reading See Exercise 25 for an illustration. 


4.7 THE NUMBER OF STRATA 

The question that now remains to be answered relates to the number of
strata to have. It has already been noted in Secs. 4.4 and 4.5 that
stratification along with proportionate allocation always produces a smaller
variance than unstratified sampling.¹ The same argument applies to
further subdivision of the strata. The subdivision can be carried to
the point that only one unit is selected from each stratum, thus making
the number of strata as large as the number of units to be selected. And,
provided that self-weighting estimators are used, the multiplicity of strata
will cause no inconvenience so far as calculations are concerned. This is
one reason that the survey statistician favors the use of a large number of
strata. But it is also recognized that, beyond a reasonable number of
strata, doubling the number (of strata) does not bring about a
proportionate reduction in the variance when the stratification for y is made on the
basis of another character x. How this is so may be demonstrated with
the help of a simple model. Let x be a rectangular variate in the range
0 to a. Further, let y = x + e, where x and e are uncorrelated. Then
$\sigma_y^2 = \sigma_x^2 + \sigma_e^2$. Let g strata of equal width be formed. Then

$$\sigma_{xh}^2 = \frac{a^2}{12g^2} \qquad W_h = \frac{N_h}{N} = \frac{1}{g}$$

If the sample allocation is equal, the variance of the estimate of the
population mean will be

$$\frac{1}{n}\left(\frac{a^2}{12g^2} + \sigma_e^2\right) = \frac{a^2}{12ng^2} + \frac{\sigma_e^2}{n}$$

If the number of strata be increased to $\lambda g$, the variance based on
$\lambda g$ strata will be

$$\frac{1}{\lambda^2} \frac{a^2}{12ng^2} + \frac{\sigma_e^2}{n}$$

Thus only the first component of the variance is reduced by increasing
the number of strata. A point will soon be reached
when the second component begins to dominate the variance, and any
further increase in the number of strata will not bring about a
worthwhile gain in precision.

¹ When selection is made with replacement.
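
The diminishing returns in this model are easy to tabulate. The sketch below evaluates $a^2/(12ng^2) + \sigma_e^2/n$ for several values of g; the function name and the numerical values of a, $\sigma_e^2$ and n are hypothetical choices made only for illustration.

```python
def variance_of_mean(a, sigma_e2, n, g):
    """Variance of the estimated mean with g equal-width strata, equal allocation."""
    return a ** 2 / (12 * n * g ** 2) + sigma_e2 / n


# Hypothetical values: range a = 12, residual variance 1, sample size 100.
values = {g: variance_of_mean(a=12.0, sigma_e2=1.0, n=100, g=g)
          for g in (1, 2, 4, 8, 16)}
for g, v in values.items():
    print(g, round(v, 6))
```

Going from one stratum to two removes most of the variance, whereas going from 8 to 16 strata changes it only slightly, because the residual component $\sigma_e^2/n$ is untouched by stratification on x.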

4.8 SOME PRACTICAL SITUATIONS

Some practical problems encountered in stratified sampling will
now be considered.

4.8.1 THE METHOD OF COLLAPSED STRATA

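Suppose that stratification is carried to the point that just one unit is
selected from each stratum, the jth unit of stratum h being selected with
probability $p_{jh}$. The estimate of the population total is
$\hat{Y} = \sum \hat{Y}_h$, where $\hat{Y}_h = y_{jh}/p_{jh}$, and

$$V(\hat{Y}) = \sum V(\hat{Y}_h)$$

But with a single unit in the sample from a stratum, it is not possible to make a
rigorous estimate of $V(\hat{Y})$. In the method of collapsed strata, the strata
are grouped (or collapsed) into pairs and we calculate

$$b = \sum_{i=1}^{L/2} (\hat{Y}_{i1} - \hat{Y}_{i2})^2$$

where $\hat{Y}_{i1}$, $\hat{Y}_{i2}$ are the estimates of the totals of the two strata forming the
ith pair. It is easy to see that

$$E(b) = \sum_i [(Y_{i1} - Y_{i2})^2 + V(\hat{Y}_{i1}) + V(\hat{Y}_{i2})] = V(\hat{Y}) + \sum_i (Y_{i1} - Y_{i2})^2$$

This shows that the quantity b, used as an estimate of the variance, will
overstate the true variance, the overstatement depending upon the
extent to which strata forming the same pair differ with respect to their
totals. If the pairing could be so arranged (before the collection of data)
that the strata forming the pair are about equal in size (total of y), the
overstatement will not be serious. It may be remarked that by
substituting $1/N_h$ for $p_{jh}$, we obtain results appropriate to simple random
sampling.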
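
The overstatement by the collapsed-strata quantity can be verified exactly on a toy population by enumerating every possible sample. In this sketch one unit is drawn at random from each stratum (so $p_{jh} = 1/N_h$); the population values, the pairing, and the function names are hypothetical.

```python
from itertools import product

# Hypothetical population: four strata, collapsed into the pairs (0, 1) and (2, 3).
strata = [[1.0, 3.0], [2.0, 2.0, 5.0], [4.0, 6.0], [3.0, 7.0, 8.0]]
pairs = [(0, 1), (2, 3)]


def stratum_estimates(stratum):
    """Possible values of the estimated stratum total, each with chance 1/N_h."""
    return [len(stratum) * y for y in stratum]


def exact_moments():
    est = [stratum_estimates(s) for s in strata]
    totals = [sum(s) for s in strata]
    true_var = sum(sum((e - t) ** 2 for e in es) / len(es)
                   for es, t in zip(est, totals))
    # every combination of one estimate per stratum is equally likely
    samples = list(product(*est))
    expected_b = sum(sum((s[i] - s[j]) ** 2 for i, j in pairs)
                     for s in samples) / len(samples)
    overstatement = sum((totals[i] - totals[j]) ** 2 for i, j in pairs)
    return true_var, expected_b, overstatement


true_var, expected_b, overstatement = exact_moments()
print(true_var, expected_b, overstatement)
```

The expected value of the collapsed-strata quantity equals the true variance plus the sum of squared differences between the paired stratum totals, so pairing strata with similar totals keeps the overstatement small.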


4.8.2 ESTIMATION OF GAIN DUE TO STRATIFICATION 

It is of interest to determine from a survey, carried out according to a 
particular stratification, how useful the mode of stratification has been. 
A comparison can be made with the situation in which no stratification is 
used by estimating from the stratified sample the variance of the estimate 
in case of unstratified sampling. If sampling within strata is with proba¬ 
bility proportional to x, we have for wr sampling 

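$$\hat{Y} = \sum_h \hat{Y}_h = \sum_h \frac{1}{n_h} \sum_j \frac{y_{jh}}{p_{jh}}$$

and

$$\hat{V}(\hat{Y}) = \sum_h \frac{1}{n_h(n_h - 1)} \sum_j \left(\frac{y_{jh}}{p_{jh}} - \hat{Y}_h\right)^2 = \sum_h \hat{V}(\hat{Y}_h) \tag{4.28}$$

If a pps sample of size $n = \sum n_h$ is selected directly from the entire
population without using any stratification, the variance of $\hat{Y}$ will be

$$V_1 = \frac{X}{n} \sum_h \sum_j X_{jh} \left(\frac{Y_{jh}}{X_{jh}} - R_h + R_h - R\right)^2 = \frac{X}{n} \sum_h \sum_j X_{jh} \left(\frac{Y_{jh}}{X_{jh}} - R_h\right)^2 + \frac{X}{n} \sum_h X_h (R_h - R)^2 \tag{4.29}$$

where $R_h = Y_h/X_h$ and $R = \sum Y_h/\sum X_h$. The first term in (4.29) equals
$(X/n) \sum (n_h/X_h) V(\hat{Y}_h)$, so that an unbiased estimate of it
can be immediately made from (4.28). In order to estimate the second
part, we try $\sum X_h(\hat{R}_h - \hat{R})^2$, where $\hat{R}_h = \hat{Y}_h/X_h$, $\hat{R} = \sum \hat{Y}_h/X$. We have

$$E \sum X_h (\hat{R}_h - \hat{R})^2 = \sum X_h (R_h - R)^2 + \sum X_h V(\hat{R}_h - \hat{R})$$

Hence

$$E \sum X_h (\hat{R}_h - \hat{R})^2 = \sum X_h (R_h - R)^2 + \sum \left(\frac{1}{X_h} - \frac{1}{X}\right) V(\hat{Y}_h)$$

Thus an unbiased estimate of $V_1$ is provided by

$$\hat{V}_1 = \frac{X}{n} \sum \left(\frac{n_h - 1}{X_h} + \frac{1}{X}\right) \hat{V}(\hat{Y}_h) + \frac{X}{n} \sum X_h (\hat{R}_h - \hat{R})^2 \tag{4.30}$$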

Equations (4.28) and (4.30) provide a comparison between the two 
variances based on the stratified sample. 


4.8.3 DEPENDENT SELECTION 

In all applications considered so far, selection within one stratum is
independent of that within another. There may be situations, however, in
which it is considered desirable that certain combinations of units
(belonging to different strata) be given a higher probability of selection at the
expense of other combinations. This procedure is called controlled
selection (Goodman and Kish, 1950). Thus the selection is not made
independently but in a dependent manner. As an example, let there be two
strata; a sample of one unit is to be selected from each stratum with
probabilities $p_i$ (i = 1, 2, ..., $N_1$) and $p_j$ (j = 1, 2, ..., $N_2$) from
the two strata, respectively. Let $p_{ij}$ be the joint probability of selecting
unit $U_i$ from the first stratum and unit $U_j$ from the second. An actual
example is provided in Table 4.1, in which the probabilities $p_{ij}$ (to be
obtained by dividing by 100) are given in the body of the table. The

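Table 4.1 Probabilities of selecting two units, one from each stratum
(entries are 100 times the probabilities)

Stratum 1 units: A, B, C, D, E, F. Stratum 2 units: a, b, c, d, e.

Stratum 2 unit    Nonzero entries (left to right)    $p_j$
a                 15                                 15
b                 10, 20                             30
c                 10                                 10
d                 20, 5                              25
e                 20                                 20

$p_i$ (A to F)    15, 10, 20, 10, 20, 25             100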

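preferred combinations have higher probabilities of selection, and the
nonpreferred ones have a lower (or zero) chance of selection.
Yet, the probabilities $p_i$ and $p_j$ are preserved. An unbiased
estimate of the total of the two strata is

$$\hat{Y} = \frac{y_i}{p_i} + \frac{y_j}{p_j}$$

But the variance of $\hat{Y}$ is given by

$$V(\hat{Y}) = \sum_i \frac{Y_i^2}{p_i} + \sum_j \frac{Y_j^2}{p_j} + 2 \sum_i \sum_j \frac{p_{ij}}{p_i p_j} Y_i Y_j - Y^2 \tag{4.31}$$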

4.8.4 ESTIMATING SEVERAL PARAMETRIC FUNCTIONS 

Instead of estimating just one function $Y = \sum N_h \bar{Y}_h$ of the strata means,
we may be interested in providing estimates for several of them. For 
example, we may want to estimate the area under food crops not only for 
the province as a whole but also for a particular group of strata within 
the province. For each linear function there is one best allocation of the 
total sample size, and these allocations may not agree. The problem to 
be considered is how to arrive at a single allocation of the total sample to 
be used in the survey. The same problem arises when there are a number 
of items to be estimated from the same survey, but only one allocation can 
be used. The answer to these problems will depend on (Raj, 1957) what 
the sample is supposed to achieve. We shall consider some approaches 
to the problem. 

Minimization of cost plus loss If the results obtained from the sample
are going to form the basis of some practical action, we may be able to
calculate in monetary terms the loss that will be incurred in a decision
through an error of amount d in the estimate. For example, if this loss be
$\mu d^2$ (Yates, 1949), and the estimate is unbiased, the average loss in a
series of samples of the same type and size will be $\mu_i V(\hat{L}_i)$ for the ith
linear function $L_i = \sum_h l_{ih} \bar{Y}_h$. The purpose in taking the sample may be
to diminish the sum of the total expected loss

$$L = \sum_i \mu_i V(\hat{L}_i) = \sum_i \mu_i \sum_h l_{ih}^2 S_h^2 \left(\frac{1}{n_h} - \frac{1}{N_h}\right)$$

and the total cost $C = \sum c_h n_h^g$, where the cost function used is more
general than the usual one. Thus the best values of the $n_h$ can be found
by calculus methods. (See Exercise 21.)

Minimization of cost As an alternative approach to the problem, we
may consider the survey to be useful if the parametric functions $L_i$ are
estimated with some desired variances $v_i$. The total sample is then to be
allocated to the strata so that the cost
of the survey is made a minimum. The problem then reduces to the
minimization of

$$f(n_1, n_2, \ldots, n_L) = C$$

subject to the conditions

$$v_i - V(\hat{L}_i) \ge 0$$

Minimization of variances Another type of requirement may be that
the relative precisions of the different estimates be in some assigned ratios.
As a particular case, it may be desired that the coefficients of variation of
different estimates be all equal, the common value being necessarily
dependent on the cost of the survey. In such a case the variance of one
of the estimates will be minimized for fixed cost and for stipulated
relations between the variances.

If the relative precisions of the estimates are governed by

$$V(\hat{L}_1) = a_2 V(\hat{L}_2) = \cdots = a_r V(\hat{L}_r)$$

the best values of the $n_h$ can be obtained by calculus methods.

Further reading The problem of sample allocation to strata when several
items are to be estimated from the survey is considered in Exercise 30.

4.9 THE STRATUM OF NONRESPONDENTS

Another problem, though not strictly related to the contents of this
chapter, will be presented now. The reason for its inclusion here is that
its solution is facilitated by assuming that a population can be considered
to be divided up into strata, although we have no lists of the units falling
in the strata and their sizes are unknown. The problem arises this
way. A random sample of units is selected and questionnaires are mailed
to get information on some items. No response is received
from a number of units. It is not proper to base the results of the survey
on the respondents alone, since the nonrespondents may be different from
the respondents. And it is very expensive to hold a personal
interview with all the nonrespondents. This situation may be tackled by
assuming that the entire population is divided up into two strata: the
first is made up of those who would respond to the questionnaire under
the conditions of the survey, and the second is made up of those who
would not. The respondents of the survey give a random sample from
the first stratum, and the nonrespondents form a random sample taken
from the second stratum. Let $n_1$ and $n_2 = n - n_1$ be the sample numbers
observed in the two strata. In order to collect information from the
second stratum, a subsample of a convenient size $u = n_2/k$ is taken, and
information is collected by personal interview from these units. The two
samples are then pooled to get an estimate for the entire population.
How this can be done will be evident from the following theorem (Hansen and
Hurwitz, 1946).


Theorem 4.5 


Let a random sample of n units contain $n_1$ units from the response stratum
and $n_2$ from the nonresponse stratum. If information can be collected on a
sample of $u = n_2/k$ units from the second stratum, an unbiased estimate of
the population average for y is given by

$$\hat{M} = \frac{n_1 \bar{y}_{n_1} + n_2 \bar{y}_u}{n} \tag{4.32}$$

with a variance of

$$V(\hat{M}) = \frac{k - 1}{n} W_2 S_{y2}^2 + \left(\frac{1}{n} - \frac{1}{N}\right) S_y^2 \tag{4.33}$$

where $W_2$, $S_{y2}^2$ represent, respectively, the weight and variance of the second
stratum.

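proof Given the sample of n units, $E_2(\bar{y}_u) = \bar{y}_{n_2}$, so that

$$E(\hat{M}) = \frac{1}{n} E_1(n_1 \bar{y}_{n_1} + n_2 \bar{y}_{n_2}) = E_1(\bar{y}_n) = \bar{Y}$$

Again, given the sample of n units, the conditional variance of $\hat{M}$ would be

$$V_2(\hat{M}) = \frac{n_2^2}{n^2}\left(\frac{1}{u} - \frac{1}{n_2}\right) S_{n_2}^2 = \frac{(k - 1) n_2}{n^2} S_{n_2}^2$$

where $S_{n_2}^2$ represents the variance for the sample of $n_2$ units. By Theorem
1.8, we have

$$V(\hat{M}) = E_1 V_2(\hat{M}) + V_1 E_2(\hat{M}) = E_1\left[\frac{(k - 1) n_2}{n^2} S_{n_2}^2\right] + V_1(\bar{y}_n)$$

Now, given that the sample of n gives $n_2$ nonrespondents,
$E(S_{n_2}^2) = S_{y2}^2$ = the variance of y
in the nonresponse stratum. And $E(n_2) = nW_2$. Hence, by Theorem
1.7, we get

$$V(\hat{M}) = \frac{k - 1}{n} W_2 S_{y2}^2 + \left(\frac{1}{n} - \frac{1}{N}\right) S_y^2$$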

Remark The first term in (4.33) represents the contribution to the
variance due to the fact that only a fraction of the nonrespondents were
contacted for collecting information. This term would vanish for k = 1.

Remark Let $c_0$ be the cost per questionnaire of mailing, $c_1$ be the cost
per questionnaire of processing the completed questionnaires, and $c_2$ be
the cost (per questionnaire) both of enumerating and of processing returns
obtained from the nonrespondents. Since

$$E(n_1) = nW_1 \qquad E(u) = \frac{nW_2}{k}$$

the expected cost of the survey will be given by

$$C = c_0 n + nW_1 c_1 + \frac{nW_2 c_2}{k} \tag{4.34}$$

The problem is to choose the initial sample size n and the subsampling
rate k such that the expected cost is minimized for a given value $V_0$ of the
variance of $\hat{M}$. It can be easily shown that

$$k^2 = \frac{c_2 (S_y^2 - W_2 S_{y2}^2)}{S_{y2}^2 (c_0 + c_1 W_1)} \tag{4.35}$$

and

$$n = \frac{N[S_y^2 + (k - 1) W_2 S_{y2}^2]}{NV_0 + S_y^2} \tag{4.36}$$


To use formulas such as (4.35) and (4.36) in practice, we may
assume that $S_{y2}^2 = S_y^2$ and that an advance estimate of $W_2$, the rate of
nonresponse, is available. In such a situation the formulas become

$$k^2 = \frac{c_2 W_1}{c_0 + c_1 W_1} \qquad n = n_0[1 + (k - 1) W_2] \tag{4.37}$$

where $n_0$ is the sample size required to achieve a variance
of $V_0$ if there be no nonresponse.

Remark In a survey of establishments in Greece for estimating wages,
the information collected showed
$W_2 = 0.4$, $c_0 = 1$, $c_1 = 5$, $c_2 = 15$. In this case k = 1.5, and
if $n_0$ is 100, questionnaires will be mailed to 120 establishments
and two-thirds of the nonrespondents will be interviewed.
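
The simplified formulas (4.37), which take the variance in the nonresponse stratum equal to the overall variance, can be sketched as below. The function name `nonresponse_plan` is hypothetical, and the cost figures and nonresponse rate are illustrative inputs chosen so that the result agrees with the case k = 1.5 and 120 questionnaires for $n_0 = 100$.

```python
from math import sqrt


def nonresponse_plan(c0, c1, c2, W2, n0):
    """Subsampling rate k and initial sample size n from (4.37).

    c0 -- mailing cost per questionnaire
    c1 -- processing cost per returned questionnaire
    c2 -- cost of interviewing and processing one nonrespondent
    W2 -- anticipated rate of nonresponse
    n0 -- sample size that would suffice with complete response
    """
    W1 = 1.0 - W2
    k = sqrt(c2 * W1 / (c0 + c1 * W1))
    n = n0 * (1.0 + (k - 1.0) * W2)
    return k, n


# Illustrative inputs (assumed, see the lead-in above).
k, n = nonresponse_plan(c0=1.0, c1=5.0, c2=15.0, W2=0.4, n0=100)
print(round(k, 6), round(n, 6), round(1 / k, 6))
```

The reciprocal 1/k is the fraction of nonrespondents to be interviewed. A dearer interview (larger $c_2$) raises k and therefore lowers that fraction.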








4.10 LATIN SQUARE STRATIFICATION 1 




Suppose there are two important criteria, A and B, of stratification such
that p strata can be constructed from the A criterion and within each of
these, p from the B criterion, giving in all $p^2$ substrata, or cells. As an
example, the A criterion may be altitude of locality, there being p altitude
groups. The B criterion may be the size (population) of the locality,
there being p size groups. It is considered very desirable to introduce this
stratification into the survey, but the number of units that can be taken
into the sample is small and not all the $p^2$ substrata can be sampled.
(The problem is more relevant when the units are the psu's of Chap. 6.)
We shall consider the situation when just p units can be taken into the
sample, and each substratum or cell in the population contains M units.
In order that each altitude group and each population group be
represented in the sample, the Latin square design (see Table 4.2) is the
obvious choice, the rows of the Latin square representing the population
groups and the columns the altitude groups. The selection proceeds this
way. Select one cell at random from the first row and delete from the
population the p - 1 cells occurring in the column to which the selected

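Table 4.2 Latin square selection
(p = 5; one selected cell, marked X, in each row and in each column)

Columns: $c_1$ $c_2$ $c_3$ $c_4$ $c_5$

$r_1$ X
$r_2$ X
$r_3$ X
$r_4$ X
$r_5$ X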

cell belongs. From the second row select one cell at random from the
p - 1 that remain and delete from the population the p - 1 cells that
occur in the column to which the second selection belongs, and so on.
This gives a sample of p cells. Within each selected cell select one unit
at random from the M. Let $y_{rc}$ be the observation on the unit in the
sample cell occurring in the rth row and the cth column. It is easy to
verify that the unconditional probability that a cell is selected in the
sample is 1/p and the unconditional probability that two specified cells
(not in the same row or column) are selected is 1/[p(p - 1)]. The expected
value of $Y_{1i}$ (the selection in the first row) is
$(1/p) \sum_{i=1}^{p} Y_{1i} = G_{r1}/p$, where
$G_{r1}$ refers to the total of the first row. The variance of $Y_{1i}$ is
$\sum_i Y_{1i}^2/p - (G_{r1}/p)^2$. Also

$$E(Y_{1i} Y_{2j}) = \frac{1}{p(p - 1)} \sum_{i \ne j} Y_{1i} Y_{2j} = \frac{1}{p(p - 1)} [Y_{11}(G_{r2} - Y_{21}) + \cdots] = \frac{G_{r1} G_{r2} - \sum_i Y_{1i} Y_{2i}}{p(p - 1)}$$

With this background, we shall prove Theorem 4.6.

1 To be read with Sec. 9.6.


Theorem 4.6 

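In the p x p Latin square design in which one unit is selected from the M
contained in the selected cell, an unbiased estimate of the population total is
given by

$$\hat{Y} = pM \sum y_{rc} \tag{4.38}$$

the sum extending over the p selected cells, with a variance of

$$V(\hat{Y}) = \frac{p^4}{p - 1}(\sigma^2 - \sigma_r^2 - \sigma_c^2) + pM^2 \sum_r \sum_c \sigma_{rc}^2 \tag{4.39}$$

where

$$\sigma^2 = \frac{1}{p^2} \sum_r \sum_c (Y_{rc} - \bar{Y})^2 \qquad \sigma_r^2 = \frac{1}{p} \sum_r (\bar{Y}_r - \bar{Y})^2 \qquad \sigma_c^2 = \frac{1}{p} \sum_c (\bar{Y}_c - \bar{Y})^2$$

$Y_{rc}$ being the total of cell (r,c), $\bar{Y}_r = G_r/p$ and $\bar{Y}_c = G_c/p$ the row and
column averages of the cell totals, $\bar{Y} = Y/p^2$, and
$\sigma_{rc}^2$ = variance of y in (r,c) cell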


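proof Given the selected cells,

$$E_2(\hat{Y}) = p \sum Y_{rc} \qquad V_2(\hat{Y}) = p^2 M^2 \sum \sigma_{rc}^2$$

the sums extending over the selected cells. Hence

$$E(\hat{Y}) = p E_1\left(\sum Y_{rc}\right) = p \sum_r \frac{G_r}{p} = Y$$

Thus $\hat{Y}$ is unbiased. And

$$E_1 V_2(\hat{Y}) = pM^2 \sum_r \sum_c \sigma_{rc}^2$$

Now

$$V_1 E_2(\hat{Y}) = p^2 \sum_r \left[\frac{\sum_c Y_{rc}^2}{p} - \left(\frac{G_r}{p}\right)^2\right] + p^2 \sum_r \sum_{r' \ne r} \left[\frac{G_r G_{r'} - \sum_c Y_{rc} Y_{r'c}}{p(p - 1)} - \frac{G_r G_{r'}}{p^2}\right]$$

But

$$\sum_r \sum_{r' \ne r} G_r G_{r'} = Y^2 - \sum_r G_r^2 \qquad \sum_r \sum_{r' \ne r} \sum_c Y_{rc} Y_{r'c} = \sum_c G_c^2 - \sum_r \sum_c Y_{rc}^2$$

Hence,

$$V_1 E_2(\hat{Y}) = \frac{p^2}{p - 1}\left[\sum_r \sum_c Y_{rc}^2 - \frac{1}{p} \sum_r G_r^2 - \frac{1}{p} \sum_c G_c^2 + \frac{Y^2}{p^2}\right] = \frac{p^2}{p - 1}(p^2\sigma^2 - p^2\sigma_r^2 - p^2\sigma_c^2) = \frac{p^4}{p - 1}(\sigma^2 - \sigma_r^2 - \sigma_c^2)$$

This proves the theorem, which is due to Cornfield and Evans (Hansen
et al., 1953).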

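Remark If only the rows of the Latin square are used as strata, and one
cell is selected at random from each row without regard to the columns,
an unbiased estimate of the population total is given by $\hat{Y} = pM \sum y_{rc}$.
To obtain its variance we note that

$$E_1 V_2(\hat{Y}) = pM^2 \sum_r \sum_c \sigma_{rc}^2$$

$$V_1 E_2(\hat{Y}) = p^2 \sum_r \left[\frac{\sum_c Y_{rc}^2}{p} - \left(\frac{G_r}{p}\right)^2\right] = p^3\sigma^2 - p^3\sigma_r^2$$

Hence

$$V(\hat{Y}) = p^3(\sigma^2 - \sigma_r^2) + pM^2 \sum_r \sum_c \sigma_{rc}^2$$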



A comparison of the two variances shows that the two-way stratification
of the Latin square will be superior to one-way stratification if

\[ \frac{p^4}{p-1}\,(\sigma^2 - \sigma_r^2 - \sigma_c^2) < p^3(\sigma^2 - \sigma_r^2) \]

or

\[ \sigma_c^2 > \frac{\sigma^2 - \sigma_r^2}{p} \]

Remark One method of getting an unbiased estimate of the variance
given by (4.39) is to have m independent replications of the Latin square
arrangement. Then m independent estimates of Y can be made and an
application of Theorem 3.4 provides an estimate of variance.

REFERENCES

Aoyama, H. (1954). A study of the stratified random sampling. Ann. Inst.
Stat. Math., 6.

Cochran, W. G. (1961). Comparison of methods for determining stratum
boundaries. Bull. Int. Stat. Inst., 38.

Dalenius, T. (1950). The problem of optimum stratification. Sk. Akt., 3, 4.

Dalenius, T. and J. L. Hodges, Jr. (1959). Minimum variance stratification.
J. Am. Stat. Assoc., 54.

Ekman, G. (1959). An approximation useful in univariate stratification. Ann.
Math. Stat., 30.

Goodman, R. and L. Kish (1950). Controlled selection - a technique in
probability sampling. J. Am. Stat. Assoc., 45.

Hansen, M. H. and W. N. Hurwitz (1946). The problem of nonresponse in
sample surveys. J. Am. Stat. Assoc., 41.

Hansen, M. H., W. N. Hurwitz, and W. G. Madow (1953). "Sample Survey
Methods and Theory." John Wiley & Sons, Inc., New York.

Mahalanobis, P. C. (1952). Some aspects of the design of sample surveys.
Sankhya, 12.

Neyman, J. (1934). On the two different aspects of the representative method:
the method of stratified sampling and the method of purposive selection.
J. Roy. Stat. Soc., 97.

Raj, D. (1957). On estimating parametric functions in stratified sample
designs. Sankhya, 17.

Raj, D. (1963). A note on stratification in unequal probability sampling.
Sankhya, B 25.

Raj, D. (1964). On forming strata of equal aggregate size. J. Am. Stat. Assoc.

Stuart, A. (1954). A simple presentation of optimum sampling results.
J. Roy. Stat. Soc., B 16.

Yates, F. "Sampling Methods for Censuses and Surveys." Hafner
Publishing Company, Inc., New York.






CHAPTER FIVE


FURTHER USE OF SUPPLEMENTARY INFORMATION


5.1 INTRODUCTION 

If there is one thing that distinguishes sampling theory from general
statistical theory, it is the degree of emphasis laid on the use of auxiliary
information for improving the precision of estimates. We have already
had some examples of it in earlier chapters. Auxiliary information was
used in Chap. 4 for purposes of stratification. In Chap. 3 the probabilities
of selection of the units were based on the measures of size provided by
supplementary information. We shall now present some further methods
of making use of auxiliary information to achieve higher precision.
Another related problem that will be discussed in this chapter is the
estimation of a population ratio R = Y/X. In the previous chapters
attention was focused on the estimation of totals, means, and proportions.



... x to estimate the total for y. This procedure is called ratio estimation.
For example, suppose it is desired to estimate the total agricultural area
in a region containing N communes. There are very big communes and
very small communes and this makes the character y vary tremendously
over the region. But the ratio of agricultural area and the population of
the commune, which is the per capita agricultural area, would be less
variable. If population figures x are known for each commune, say from
an earlier population census, it would be preferable to estimate the ratio of
agricultural area and the census population from the sample of communes
and multiply this figure by the known census population total of all the
communes in the region. If a random sample of n communes is taken, with
$Sy_i$ and $Sx_i$ as the sample totals for y and x, respectively, the total of y for the
region would be estimated by $\hat{Y} = X\,Sy_i/Sx_i$. A related problem is the
estimation of the ratio R = Y/X, where Y is the total agricultural area and X
the total population in the region. If we take a random sample of communes
and determine the agricultural area and the population (existing number of
persons, not the census population), it is natural to estimate R by
$\hat{R} = Sy_i/Sx_i$. It should be noted that the two problems are different,
although they are connected. For estimating Y we have used information on
another character x; this information need not be recent, but must be known
for the entire universe. On the other hand, information on a sample basis is
required for y as well as for x (the denominator of the ratio) if the purpose
is to estimate the ratio R = Y/X in the population. Since the theory
is the same in either case, most of the subsequent results will relate to the
problem of estimating a ratio.

The following theorem shows that $\hat{R}$ and $\hat{Y}$ are usually biased estimators
of R and Y (Goodman and Hartley, 1958).


Theorem 5.1

In simple random sampling the bias of the ratio estimator
$\hat{R} = Sy_i/Sx_i = \bar{y}/\bar{x}$ is given by

\[ B(\hat{R}) = -\frac{\operatorname{Cov}(\hat{R},\bar{x})}{\bar{X}} \tag{5.1} \]

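proof As $\operatorname{Cov}(\bar{y}/\bar{x},\bar{x}) = E(\bar{y}) - E(\bar{y}/\bar{x})E(\bar{x})$, we have

\[ E(\hat{R}) = R - \frac{1}{\bar{X}}\operatorname{Cov}(\hat{R},\bar{x}) \]

or

\[ E(\hat{R}) - R = B(\hat{R}) = -\frac{1}{\bar{X}}\operatorname{Cov}(\hat{R},\bar{x}) \]

■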

Corollary

Denoting the standard deviation of $\hat{R}$ by $\sigma(\hat{R})$ we have

\[ B(\hat{R}) = -\rho(\hat{R},\bar{x})\,\sigma(\hat{R})\,\frac{\sigma(\bar{x})}{\bar{X}} \]

or

\[ \frac{B(\hat{R})}{\sigma(\hat{R})} = -\rho(\hat{R},\bar{x})\,\mathrm{CV}(\bar{x}) \]

Hence

\[ \frac{|B(\hat{R})|}{\sigma(\hat{R})} \le \mathrm{CV}(\bar{x}) \tag{5.2} \]

where CV stands for the coefficient of variation, and ρ is the correlation
coefficient. This result is extremely useful in practice when it is considered
important that the bias of the estimator be negligibly small in
order that proper confidence statements (Sec. 2.11) be made. In that
case the sample size n is to be so chosen that
$\mathrm{CV}(\bar{x}) = (1/n - 1/N)^{1/2} S_x/\bar{X}$ is smaller than 1/10, say.


Remark The bias associated with $\hat{Y} = X\bar{y}/\bar{x}$ would be $X B(\hat{R})$.

Remark $\hat{R}$ is unbiased if $\rho(\hat{R},\bar{x}) = 0$.
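
The exact expression (5.1) for the bias can be checked on a small population by listing every possible sample. The following is only an illustrative sketch: the function name `ratio_bias_check` and the five-unit population are hypothetical, and exact fractions are used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction
from itertools import combinations

def ratio_bias_check(y, x, n):
    """Return (exact bias of R_hat, -Cov(R_hat, x_bar)/X_bar) over all
    simple random samples of size n drawn without replacement."""
    N = len(y)
    R = Fraction(sum(y), sum(x))          # population ratio Y/X
    X_bar = Fraction(sum(x), N)           # population mean of x
    samples = list(combinations(range(N), n))
    m = len(samples)                      # each sample has probability 1/m
    r_hat = [Fraction(sum(y[i] for i in s), sum(x[i] for i in s)) for s in samples]
    x_bar = [Fraction(sum(x[i] for i in s), n) for s in samples]
    e_r = sum(r_hat) / m
    e_x = sum(x_bar) / m
    cov = sum(r * xb for r, xb in zip(r_hat, x_bar)) / m - e_r * e_x
    return e_r - R, -cov / X_bar

# Hypothetical population, used only for illustration.
y = [12, 30, 45, 70, 160]
x = [10, 22, 40, 55, 130]
bias, rhs = ratio_bias_check(y, x, n=2)
print(bias, rhs)
```

Both returned values are computed independently (one from the definition of bias, the other from the covariance), so their agreement is a direct check of the theorem for the population supplied.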


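5.4 AN APPROXIMATE EXPRESSION FOR BIAS

The exact expression for the bias of the ratio estimate derived in Sec. 5.3
is not always very useful. An approximation involving the coefficients of
variation of x and y is obtained in Theorem 5.2.

Theorem 5.2

In simple random sampling an approximate value of the bias of $\hat{R} = \bar{y}/\bar{x}$ is
given by

\[ B(\hat{R}) = R\,\mathrm{CV}(\bar{x})\,[\mathrm{CV}(\bar{x}) - \rho\,\mathrm{CV}(\bar{y})] \tag{5.3} \]

proof We have

\[ E(\hat{R}) - R = E\Bigl(\frac{\bar{y} - R\bar{x}}{\bar{x}}\Bigr) = E\Bigl(\frac{\bar{y} - R\bar{x}}{\bar{X} + \delta\bar{x}}\Bigr) \qquad \delta\bar{x} = \bar{x} - \bar{X} \]

Hence $B(\hat{R}) = f(\theta)$ calculated at θ = 1, where
$f(\theta) = E[(\bar{y} - R\bar{x})/(\bar{X} + \theta\,\delta\bar{x})]$. We shall find Taylor's expansion
of f(θ) around θ = 0. This expansion is

\[ f(\theta) = f(0) + \theta f'(0) + \frac{\theta^2}{2!}f''(0) + \cdots \]

Now

\[ f(0) = \frac{E(\bar{y} - R\bar{x})}{\bar{X}} \qquad f'(0) = -\frac{E[(\bar{y} - R\bar{x})\,\delta\bar{x}]}{\bar{X}^2} \]

Hence

\[ B(\hat{R}) = \frac{E(\bar{y} - R\bar{x})}{\bar{X}} - \frac{E[(\bar{y} - R\bar{x})\,\delta\bar{x}]}{\bar{X}^2} + \frac{E[(\bar{y} - R\bar{x})(\delta\bar{x})^2]}{\bar{X}^3} - \cdots \]

The expansion proceeds in powers of $\delta\bar{x}$, the successive terms becoming
smaller and smaller. If only the first two terms in the expansion are
used (the first of which vanishes, since $E(\bar{y} - R\bar{x}) = 0$), an approximation
for the bias is obtained as

\[ B(\hat{R}) = -\frac{\operatorname{Cov}(\bar{y} - R\bar{x},\,\delta\bar{x})}{\bar{X}^2} = -\frac{\operatorname{Cov}(\bar{y} - R\bar{x},\,\bar{x})}{\bar{X}^2} = -\frac{\rho\,\sigma(\bar{y})\sigma(\bar{x}) - R\,\sigma^2(\bar{x})}{\bar{X}^2} = R\,\mathrm{CV}(\bar{x})\,[\mathrm{CV}(\bar{x}) - \rho\,\mathrm{CV}(\bar{y})] \]

A closer approximation can be obtained by retaining more terms. Evidently
the expression will involve moments of the joint distribution of $\bar{y}$
and $\bar{x}$. ■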


Remark The reader may wonder why the estimator based on the simple 
average of the ratios has not been considered. An application of this 
theorem shows that under a certain sampling scheme (see Exercise 35) 
the bias of this estimator is larger than that of the estimator based on the 
ratio of the two averages. 


5.5 MEAN SQUARE ERROR

We now come to the question of the precision of the ratio estimator.
Since the estimator is generally biased, its mean square error around R
would be of greater interest.


Theorem 5.3

In wtr simple random sampling, an approximation to the mean square error
of $\hat{R} = \bar{y}/\bar{x}$ around R = Y/X is given by

\[ E(\hat{R} - R)^2 \approx \frac{1-f}{n\bar{X}^2}\,(S_y^2 + R^2S_x^2 - 2\rho RS_xS_y) \tag{5.4} \]

proof We have

\[ E(\hat{R} - R)^2 = E\Bigl(\frac{\bar{y} - R\bar{x}}{\bar{x}}\Bigr)^2 = E\Bigl(\frac{\bar{y} - R\bar{x}}{\bar{X} + \delta\bar{x}}\Bigr)^2 \qquad \delta\bar{x} = \bar{x} - \bar{X} \]

Then the mean square error is the value at θ = 1 of the function

\[ f(\theta) = E\Bigl(\frac{\bar{y} - R\bar{x}}{\bar{X} + \theta\,\delta\bar{x}}\Bigr)^2 \]

Developing Taylor's expansion of f(θ) as in Sec. 5.4, we get

\[ E(\hat{R} - R)^2 = \frac{E(\bar{y} - R\bar{x})^2}{\bar{X}^2} - \frac{2E[(\bar{y} - R\bar{x})^2\,\delta\bar{x}]}{\bar{X}^3} + \cdots \tag{5.5} \]

As before, the expansion proceeds in powers of $\delta\bar{x}$, the leading term being
the first term which is of order 1/n. By retaining only the leading term
in the expansion, a first approximation to the mean square error, denoted
by $V_1(\hat{R})$, would be

\[ V_1(\hat{R}) = \frac{E[\bar{y} - \bar{Y} - R(\bar{x} - \bar{X})]^2}{\bar{X}^2} = \frac{\sigma^2(\bar{y}) + R^2\sigma^2(\bar{x}) - 2R\rho\,\sigma(\bar{y})\sigma(\bar{x})}{\bar{X}^2} \]

since $\rho(\bar{x},\bar{y}) = \rho(x,y) = \rho$. Hence

\[ V_1(\hat{R}) = \frac{1-f}{n}\,\frac{S_y^2 + R^2S_x^2 - 2R\rho S_xS_y}{\bar{X}^2} = \frac{1-f}{n}\,R^2\,[\mathrm{CV}^2(y) + \mathrm{CV}^2(x) - 2\rho\,\mathrm{CV}(y)\mathrm{CV}(x)] \]

■

A second approximation to $E(\hat{R} - R)^2$ can be obtained by including the
succeeding lower order term and so on. Fortunately, it is possible to
find bounds for the error made if only the first approximation, given by
$V_1(\hat{R})$, be used.


Corollary

An approximate expression for the mean square error of $\hat{Y} = X\bar{y}/\bar{x}$ would
be given by

\[ V_1(\hat{Y}) = X^2V_1(\hat{R}) = \frac{N^2(1-f)}{n}\,(S_y^2 + R^2S_x^2 - 2R\rho S_xS_y) = \frac{Y^2(1-f)}{n}\,[\mathrm{CV}^2(y) + \mathrm{CV}^2(x) - 2\rho\,\mathrm{CV}(y)\mathrm{CV}(x)] \]

Remark $V_1(\hat{R}) = 0$ if y is proportional to x.

Further reading Refer to Exercise 32 for a sufficient condition that the
first approximation to the true variance be an understatement.
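
The first approximation of Theorem 5.3 can be placed beside the exact mean square error, obtained by enumerating all samples, to see how far apart the two are for a given population. This is a sketch under stated assumptions: the function names `v1_ratio` and `exact_mse_ratio` and the data are hypothetical, and the product $\rho S_xS_y$ is computed as the population covariance $S_{xy}$ with divisor N - 1.

```python
from fractions import Fraction
from itertools import combinations

def v1_ratio(y, x, n):
    """First approximation V1(R_hat) to the MSE of y_bar/x_bar."""
    N = len(y)
    f = Fraction(n, N)
    Yb, Xb = Fraction(sum(y), N), Fraction(sum(x), N)
    R = Yb / Xb
    s_yy = sum((yi - Yb) ** 2 for yi in y) / (N - 1)
    s_xx = sum((xi - Xb) ** 2 for xi in x) / (N - 1)
    s_xy = sum((yi - Yb) * (xi - Xb) for yi, xi in zip(y, x)) / (N - 1)
    return (1 - f) * (s_yy + R * R * s_xx - 2 * R * s_xy) / (n * Xb ** 2)

def exact_mse_ratio(y, x, n):
    """Exact E(R_hat - R)^2 by enumeration of all wtr samples of size n."""
    R = Fraction(sum(y), sum(x))
    samples = list(combinations(range(len(y)), n))
    sq = [(Fraction(sum(y[i] for i in s), sum(x[i] for i in s)) - R) ** 2
          for s in samples]
    return sum(sq) / len(samples)

# Hypothetical population, used only for illustration.
y = [12, 30, 45, 70, 160]
x = [10, 22, 40, 55, 130]
print(float(v1_ratio(y, x, 2)), float(exact_mse_ratio(y, x, 2)))
```

When y is exactly proportional to x both quantities are zero, in line with the Remark above; otherwise the gap between them is what Sec. 5.6 sets out to bound.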


5.6 BOUNDS ON THE MSE

In view of the fact that the method of ratio estimation is widely used in
sample surveys, and only an approximate formula is available by which
its precision can be measured, it is a matter of considerable importance to
see how far this approximation is satisfactory. This will be done in this
section by finding bounds for the exact-remainder term in the expansion
(5.5). The first-derivative exact-remainder term would be
$-2E[(\bar{y} - R\bar{x})^2\,\delta\bar{x}/(\bar{X} + \theta'\delta\bar{x})^3]$, with 0 < θ' < 1. Since its derivative with respect
to θ' is positive, it is an increasing function of θ' and its lower and upper
limits are (Raj, 1964a)

\[ G_1 = -\frac{2E[(\bar{y} - R\bar{x})^2\,\delta\bar{x}]}{\bar{X}^3} \qquad \text{and} \qquad G_2 = -2E\Bigl[\frac{(\bar{y} - R\bar{x})^2\,\delta\bar{x}}{\bar{x}^3}\Bigr] \]

Hence we obtain

\[ G_1 < E(\hat{R} - R)^2 - V_1(\hat{R}) < G_2 \tag{5.6} \]

Similarly, by stopping at the third-derivative exact-remainder term, we
get the bounds

\[ A + B + C_1 < E(\hat{R} - R)^2 - V_1(\hat{R}) < A + B + C_2 \tag{5.7} \]

where

\[ A = -2E\Bigl[\frac{(\bar{y} - R\bar{x})^2\,\delta\bar{x}}{\bar{X}^3}\Bigr] \qquad B = 3E\Bigl[\frac{(\bar{y} - R\bar{x})^2(\delta\bar{x})^2}{\bar{X}^4}\Bigr] \]

\[ C_1 = -4E\Bigl[\frac{(\bar{y} - R\bar{x})^2(\delta\bar{x})^3}{\bar{X}^5}\Bigr] \qquad C_2 = -4E\Bigl[\frac{(\bar{y} - R\bar{x})^2(\delta\bar{x})^3}{\bar{x}^5}\Bigr] \]


We shall now assume that the relationship between y and x is a straight
line through the origin. This is the situation in which the ratio estimate
is going to prove more useful. Theorem 5.4 gives the bounds on the
mean square error in this case.


Theorem 5.4

Under the model

\[ y = Rx + e \qquad E(e \mid x) = 0 \]

and assuming zero correlation between $(\bar{e})^2$ and each of $\delta\bar{x}$, $(\delta\bar{x})^2$, $(\delta\bar{x})^3$, and
$(\delta\bar{x})^3/(\bar{x})^5$, the bounds on MSE $(\hat{R})$ are given by

\[ F < \frac{\mathrm{MSE}(\hat{R}) - V_1(\hat{R})}{V_1(\hat{R})} < G \tag{5.8} \]

where

\[ F = 3\,\mathrm{CV}^2(\bar{x}) - 4E\Bigl(\frac{\delta\bar{x}}{\bar{X}}\Bigr)^3 \qquad G = 3\,\mathrm{CV}^2(\bar{x}) - 4\bar{X}^2E\Bigl[\frac{(\delta\bar{x})^3}{(\bar{x})^5}\Bigr] \]

proof Using the result that E(UW) = E(U)E(W) if the random
variables U and W are uncorrelated, it follows immediately from the
assumptions made that

\[ A = 0 \qquad B = \frac{3V(\bar{e})V(\bar{x})}{\bar{X}^4} = 3\,\mathrm{CV}^2(\bar{x})\,V_1(\hat{R}) \]

\[ C_1 = -4V_1(\hat{R})\,E\Bigl(\frac{\delta\bar{x}}{\bar{X}}\Bigr)^3 \qquad C_2 = -4\bar{X}^2V_1(\hat{R})\,E\Bigl[\frac{(\delta\bar{x})^3}{(\bar{x})^5}\Bigr] \]

Making these substitutions in (5.7) we get (5.8). ■

Remark If the distribution of x is symmetrical, $E(\delta\bar{x})^3 = 0$. In this
situation the use of $V_1(\hat{R})$ results in understating the true mean square
error. The amount of understatement, as a proportion of $V_1(\hat{R})$, is at least
$3\,\mathrm{CV}^2(\bar{x})$. The understatement is higher if the third moment of x is
negative.


5.7 COMPARISON WITH THE SIMPLE AVERAGE

The circumstances under which the ratio estimate will be superior to the
simple average (the sample mean) will now be pointed out. The variance
of $\hat{Y} = N\bar{y}$ in wtr simple random sampling is given by

\[ V(\hat{Y}) = N^2(1 - f)\,\frac{S_y^2}{n} \]

In this case no use is made of the auxiliary information provided by x.
If this information is used to form the ratio estimate $\hat{Y}_R = X\bar{y}/\bar{x}$, a first
approximation (which sometimes understates) to the mean square error
around Y has been found to be

\[ V_1(\hat{Y}_R) = N^2(1 - f)\,\frac{S_y^2 + R^2S_x^2 - 2R\rho S_xS_y}{n} \]

Judging by this approximation, the ratio method will give a more precise
result whenever

\[ 2\rho > \frac{RS_x}{S_y} \qquad \text{or} \qquad \rho > \frac{\mathrm{CV}(x)}{2\,\mathrm{CV}(y)} \tag{5.9} \]

Thus the issue depends by and large on the strength of correlation between
y and x. If x is the same character as y, but has been measured on a
previous occasion, the coefficients of variation may be taken as equal.
In that case it pays to use the ratio method of estimation if ρ exceeds 1/2.
But one should not be dogmatic about inequality (5.9), since it is based
on an approximation. In fact it can be proved that even in the presence
of perfect correlation between y and x, the ratio estimate may not be as
good as the simple average. This is evident from the following (Raj,
1954) theorem.


Theorem 5.5 

Let there be perfect correlation between y and x, so that y = a + bx. In wtr
simple random sampling, the estimator $N\bar{y}$ will be superior to the ratio
estimator $X\hat{R}$ if

\[ \frac{\bar{X}^2\,V(1/\bar{x})}{S_x^2} > \frac{b^2}{a^2}\,\frac{1-f}{n} \tag{5.10} \]

The proof follows immediately from the observation that, in view
of the linear relationship, $V(\bar{y}/\bar{x}) = a^2V(1/\bar{x})$.


Remark For very large values of a, inequality (5.10) may be
satisfied. A large value of a means that the regression line passes through
a point far from the origin. Then near-proportionality between y and x
does not exist. In such a case it may be futile to make the ratio estimate
with x in the denominator.


Remark It would be a sound practice to examine the relationship
between y and x on the basis of past surveys and use this information in
the future.






5.8 SAMPLE ESTIMATE OF MSE

Denoting the random variable y - Rx by u, the approximation to the
mean square error of $\hat{R}$ may be written as

\[ V_1(\hat{R}) = \frac{(1-f)\,S_u^2}{n\bar{X}^2} \]

where

\[ S_u^2 = \frac{\sum_{i=1}^{N}(u_i - \bar{U})^2}{N-1} \]

It is thus natural to estimate $V_1(\hat{R})$ from the sample by

\[ v_1(\hat{R}) = \frac{(1-f)\,s_u^2}{n\bar{x}^2} \tag{5.11} \]

where

\[ s_u^2 = \frac{S[y_i - \bar{y} - \hat{R}(x_i - \bar{x})]^2}{n-1} \]

The estimator (5.11) is biased, since it is a ratio estimator. As $\hat{R} = \bar{y}/\bar{x}$,
$s_u^2$ can also be written as

\[ s_u^2 = (n-1)^{-1}S(y_i - \hat{R}x_i)^2 = (n-1)^{-1}(Sy_i^2 + \hat{R}^2Sx_i^2 - 2\hat{R}\,Sx_iy_i) \]
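
The sample estimate (5.11) involves only sample sums, and a short routine may make the arithmetic concrete. This is a minimal sketch, not a prescribed procedure: the function name `v1_hat` and the sample values are hypothetical, and the residual form $s_u^2 = (n-1)^{-1}S(y_i - \hat{R}x_i)^2$ is the one used.

```python
def v1_hat(ys, xs, N):
    """Sample estimate (5.11) of the approximate MSE of R_hat = y_bar/x_bar.

    ys, xs: sample values; N: population size (assumed known).
    """
    n = len(ys)
    f = n / N                                   # sampling fraction
    x_bar = sum(xs) / n
    r_hat = sum(ys) / sum(xs)                   # same value as y_bar/x_bar
    s_u2 = sum((yi - r_hat * xi) ** 2 for yi, xi in zip(ys, xs)) / (n - 1)
    return (1 - f) * s_u2 / (n * x_bar ** 2)

# Hypothetical sample of n = 4 units from a population of N = 50.
ys = [12.0, 30.0, 45.0, 70.0]
xs = [10.0, 22.0, 40.0, 55.0]
print(v1_hat(ys, xs, N=50))
```

If the population mean of x is known, it could be used in place of the sample mean in the denominator; the sketch follows the sample-based form.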


5.9 UNBIASED RATIO ESTIMATION

In view of the fact that, under simple random sampling, the ratio estimator
$X\bar{y}/\bar{x}$ is biased, one line of recent research has been to modify the
sampling procedure so that the same estimator becomes unbiased. This
will retain the simplicity of the estimator and make it unbiased at the
same time. The procedure (Lahiri, 1951; Midzuno, 1950) consists in
selecting the sample with probability proportionate to its aggregate size
(ppas). This is best done by selecting the first unit in the sample with pp
to x and the other (n - 1) units with equal probabilities without replacement.
The proof is given in the following Theorem 5.6.


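Theorem 5.6

If, from a finite population of size N, the first unit in the sample is selected
with probabilities $p_i$ (i = 1, . . . , N), $p_i = x_i/X$, and the remaining n - 1
units with equal probabilities without replacement, the probability of selecting
a particular sample s is given by

\[ \Pr(s) = \frac{Sx_i}{N'X} \qquad N' = \binom{N-1}{n-1} \]

proof Let the sample s consist of the units $U_1, U_2, \ldots, U_n$. The
probability that $U_1$ is the first selection and the other units form a simple
random sample is $p_1/N'$. Similarly the probability that $U_2$ is the first
selection and the other units form a simple random sample is $p_2/N'$, and so on.
Hence the probability of selecting the sample s in this manner is

\[ \frac{Sp_i}{N'} = \frac{Sx_i}{N'X} \]

Corollary

The expected value of $\hat{Y} = X\,Sy_i/Sx_i$ is given by

\[ E(\hat{Y}) = \sum{}' X\,\frac{Sy_i}{Sx_i}\,\frac{Sx_i}{XN'} = \sum{}' \frac{Sy_i}{N'} = Y \]

where $\sum'$ denotes summation over all possible samples s. Thus the usual
ratio estimator becomes unbiased in this scheme.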
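
The selection procedure and the unbiasedness claim can both be illustrated in a few lines. The sketch below is illustrative only: the function names `ppas_sample`, `ppas_prob`, and `expected_ratio_estimate` and the data are hypothetical. The first function draws one sample (first unit with probability proportionate to x, the rest with equal probabilities without replacement); the others enumerate all samples with the probabilities of Theorem 5.6.

```python
import random
from fractions import Fraction
from itertools import combinations
from math import comb

def ppas_sample(x, n, rng=random):
    """Draw one sample of n unit indices: first unit pp to x, rest by wtr SRS."""
    N = len(x)
    first = rng.choices(range(N), weights=x, k=1)[0]
    rest = rng.sample([i for i in range(N) if i != first], n - 1)
    return [first] + rest

def ppas_prob(sample, x, n):
    """Probability of the unordered sample: (sum of x in s) / (N' X)."""
    N = len(x)
    return Fraction(sum(x[i] for i in sample), comb(N - 1, n - 1) * sum(x))

def expected_ratio_estimate(y, x, n):
    """E(X * Sy/Sx) under ppas sampling, by enumeration of all samples."""
    X = sum(x)
    total = Fraction(0)
    for s in combinations(range(len(x)), n):
        estimate = X * Fraction(sum(y[i] for i in s), sum(x[i] for i in s))
        total += estimate * ppas_prob(s, x, n)
    return total

# Hypothetical population, used only for illustration.
y = [12, 30, 45, 70, 160]
x = [10, 22, 40, 55, 130]
print(expected_ratio_estimate(y, x, 3), sum(y))
```

Exact fractions are used in the enumeration so that the expected value can be compared with the population total Y without rounding.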


5.10 THE VARIANCE OF THE UNBIASED RATIO ESTIMATOR

An expression for the variance of the unbiased ratio estimator under
this scheme could be written down from first principles:

\[ V(\hat{Y}) = (N')^{-1}X \sum{}' \frac{(Sy_i)^2}{Sx_i} - Y^2 = V_{ppas} \tag{5.12} \]

where $\sum'$ denotes summation over all possible samples s. It is obvious
that the variance vanishes when y is proportionate to x. Nothing more is
revealed by this expression as such. An unbiased estimate of $V(\hat{Y})$ from
the sample will now be obtained.

Theorem 5.7

In ppas sampling, an unbiased estimator of $Y^2$ is given by

\[ G = \frac{1}{\Pr(s)}\Bigl[\frac{Sy_i^2}{N'} + \frac{2S'y_iy_j}{N''}\Bigr] \]

where $N'' = \binom{N-2}{n-2}$, $\Pr(s) = Sx_i/XN'$, and S' denotes summation
over different pairs in the sample.

proof From first principles, we have

\[ E\Bigl[\frac{Sy_i^2}{N'\Pr(s)}\Bigr] = \frac{1}{N'}\sum{}' Sy_i^2 = \sum_{i=1}^{N} y_i^2 \qquad E\Bigl[\frac{S'y_iy_j}{N''\Pr(s)}\Bigr] = \frac{1}{N''}\sum{}' S'y_iy_j = \sum_{i<j} y_iy_j \]

and this completes the proof.


Corollary

An unbiased estimator of $V(\hat{Y})$ is given by

\[ v(\hat{Y}) = \hat{Y}^2 - G \qquad \text{since} \qquad E(\hat{Y}^2 - G) = E(\hat{Y}^2) - Y^2 = V(\hat{Y}) \]


Remark The estimator of variance may assume negative values for
some samples.


5.11 RELATIVE PRECISION OF THE UNBIASED RATIO ESTIMATOR

A comparison of $V_{ppas}$ with the variance of some other estimators will
now be made. The discussion will be restricted to samples of two units
only. (This is appropriate when the units are psu's of Chap. 6 and only
two units are selected from a stratum.) In this case the variance of the
unbiased ratio estimator is

\[ V_{ppas} = \frac{X}{N-1}\sum{}' \frac{(y_i + y_j)^2}{x_i + x_j} - Y^2 \]

where $\sum'$ denotes summation over all different pairs in the population.
(It will be noted that the contribution of a number of pairs to
$V_{ppas}$ would be negative.) We shall assume that the N units are divided
up into a number of arrays with $x_i$ being the measure of size for all
the $N_i$ units in the ith array. Further, we assume the model

\[ y_{im} = Rx_i + e_{im} \]

where $\sum_m e_{im} = 0$, $\sum_m e_{im}^2 = aN_ix_i^g$ (a > 0, g ≥ 0). Thus for a given $x_i$,
the residuals are assumed to have a mean value of zero and a variance
proportionate to $x_i^g$. Under this model, we have (Raj, 1964b)

\[ V_{ppas} = (N-1)^{-1}X \sum{}' \frac{(e_{im} + e_{jm'})^2}{x_i + x_j} = [2(N-1)]^{-1}aX\Bigl[2\sum_i\sum_{j>i} \frac{N_iN_j(x_i^g + x_j^g)}{x_i + x_j} + \sum_i N_i(N_i - 2)x_i^{g-1}\Bigr] \]

If the sample is selected with replacement with pp to x, we have

\[ V_{pps} = \frac{aX}{2}\sum_i N_ix_i^{g-1} = [2(N-1)]^{-1}aX\Bigl[\sum_i N_i(N_i - 1)x_i^{g-1} + \sum_i\sum_{j>i} N_iN_j(x_i^{g-1} + x_j^{g-1})\Bigr] \]

Thus the quantity $\Delta = V_{pps} - V_{ppas}$ is given by

\[ \Delta = [2(N-1)]^{-1}aX\Bigl[\sum_i N_ix_i^{g-1} + \sum_i\sum_{j>i} \frac{N_iN_j(x_j - x_i)(x_i^{g-1} - x_j^{g-1})}{x_i + x_j}\Bigr] \]

The quantity Δ can now be examined for different values of g. It is
obvious that Δ > 0 for g = 0 and 1. But, for g = 1, the relative increase
in variance (with pps as the base) can be as small as 1/(N - 1). For
g = 2, pps sampling will be found to be superior whenever

\[ \sum_i\sum_{j>i} \frac{N_iN_j(x_i - x_j)^2}{x_i + x_j} > X \]

For g = 3, $\Delta = (aX/2)[N/(N-1)][\bar{X}^2 - (N-1)V(x)]$. In this case,
pps sampling will produce a lower variance if $\mathrm{CV}^2(x) > 1/(N-1)$.
In order to make a comparison with equal probability sampling, we
have for this model

\[ V_{ep} = \frac{N(N-2)}{2}\Bigl(R^2S_x^2 + \frac{a\sum_i N_ix_i^g}{N-1}\Bigr) \]

Hence, for g = 1, ppas sampling is superior to equal probability sampling
and is only slightly better than pps sampling.


5.12 UNBIASED RATIO-TYPE ESTIMATORS

Another line of research has been to modify the usual ratio estimator
(and not the sampling scheme) so that a ratio-type estimator is obtained
that is unbiased under simple random sampling. Specifically,
the estimator $\bar{r} = n^{-1}S(y_i/x_i)$ is corrected for its bias (Goodman
and Hartley, 1958) to obtain an unbiased estimator. This is proved in
the following theorem.


Theorem 5.8

In wtr simple random sampling, an unbiased estimator of R = Y/X is
given by

\[ \hat{R}' = \bar{r} + \frac{(N-1)n}{N(n-1)}\,\frac{\bar{y} - \bar{r}\bar{x}}{\bar{X}} \tag{5.13} \]

with a large sample approximation to its variance as

\[ V(\hat{R}') = \frac{S_y^2 + \bar{R}^2S_x^2 - 2\bar{R}\rho S_xS_y}{n\bar{X}^2} \tag{5.14} \]

where

\[ \bar{r} = n^{-1}S\Bigl(\frac{y_i}{x_i}\Bigr) \qquad \bar{R} = N^{-1}\sum_{i=1}^{N}\frac{Y_i}{X_i} \]


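proof We have, writing $r_i = y_i/x_i$,

\[ E(\bar{r}) = E(r_i) = \frac{E(r_ix_i) - \operatorname{Cov}(r_i,x_i)}{E(x_i)} = R - \frac{\operatorname{Cov}(r_i,x_i)}{\bar{X}} \]

Using the result that the sample covariance is an unbiased estimate of
the population covariance, we find that $\operatorname{Cov}(r_i,x_i)$ is estimated
unbiasedly by

\[ \frac{(N-1)n}{N(n-1)}\,(\bar{y} - \bar{r}\bar{x}) \]

Hence

\[ E(\hat{R}') = R - \frac{\operatorname{Cov}(r_i,x_i)}{\bar{X}} + \frac{\operatorname{Cov}(r_i,x_i)}{\bar{X}} = R \]

In order to obtain the large sample variance we note that in large samples,
when the population is assumed infinite, the factor (N - 1)n/[N(n - 1)]
reduces to unity and $\hat{R}'$ reduces to $[\bar{y} - \bar{r}(\bar{x} - \bar{X})]/\bar{X}$. By the law of
large numbers the random variable $\bar{r}$ converges in probability to $\bar{R}$.
Hence the limiting distribution of $\sqrt{n}\,(\hat{R}' - R)$ is the same as that of
$\sqrt{n}\,[\bar{y} - \bar{R}(\bar{x} - \bar{X}) - \bar{Y}]/\bar{X}$. Thus, for large values of n, the variance
of $\hat{R}'$ is approximately that of $[\bar{y} - \bar{R}(\bar{x} - \bar{X})]/\bar{X}$, which is given by
(5.14). ■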


Corollary

The large sample approximation to the mean square error of the biased
ratio estimator $\bar{y}/\bar{x}$ has been obtained (Theorem 5.3) as

\[ V_1(\hat{R}) = \frac{S_y^2 + R^2S_x^2 - 2R\rho S_xS_y}{n\bar{X}^2} \]

Denoting the large sample variance of the unbiased ratio-type estimator
as $V_2(\hat{R}')$, we get

\[ V_1(\hat{R}) - V_2(\hat{R}') = \frac{(R^2 - \bar{R}^2)S_x^2 - 2\rho S_xS_y(R - \bar{R})}{n\bar{X}^2} = \frac{[(R - \beta)^2 - (\bar{R} - \beta)^2]\,S_x^2}{n\bar{X}^2} \]

where β is the regression coefficient, $\rho S_y/S_x$, of y on x. Hence the biased
ratio estimator will give a more precise result if the regression coefficient
β is nearer to R = Y/X than to $\bar{R} = E(y/x)$. The two variances are
equal if $R = \bar{R}$.


Remark Since the expression for the unbiased ratio-type estimator
involves $\bar{r} = S(y_i/x_i)/n$, which is not simple to calculate when many
items are involved, the unbiased ratio-type estimator is unlikely to be
used in large-scale work. It may be appropriate in some specialized
investigations.


Remark An exact expression for the variance of the unbiased ratio-type
estimator has been obtained by Goodman and Hartley (1958).

Further reading

1. Another procedure of reducing the bias of a ratio estimator is
presented in the exercises.
2. When the sample is made up of m independent subsamples of
the same size (called interpenetrating subsamples), a procedure of
obtaining an unbiased ratio-type estimator is presented in the exercises.
3. When the regression of y on x is linear, it is shown in the exercises that
the usual ratio estimator is better than the ratio-type estimator.
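
The unbiasedness of the ratio-type estimator (5.13) holds exactly for any finite population, which makes it easy to verify by enumeration. The following sketch is illustrative only: the function names `unbiased_ratio_type` and `expectation_over_samples` and the data are hypothetical, and integer data are assumed so that exact fractions can be used.

```python
from fractions import Fraction
from itertools import combinations

def unbiased_ratio_type(ys, xs, N, X_bar):
    """Estimator (5.13): r_bar plus a correction for the bias of r_bar."""
    n = len(ys)
    r_bar = sum(Fraction(yi, xi) for yi, xi in zip(ys, xs)) / n
    y_bar = Fraction(sum(ys), n)
    x_bar = Fraction(sum(xs), n)
    correction = Fraction((N - 1) * n, N * (n - 1)) * (y_bar - r_bar * x_bar) / X_bar
    return r_bar + correction

def expectation_over_samples(y, x, n):
    """Average of (5.13) over all wtr simple random samples of size n."""
    N = len(y)
    X_bar = Fraction(sum(x), N)
    samples = list(combinations(range(N), n))
    values = [unbiased_ratio_type([y[i] for i in s], [x[i] for i in s], N, X_bar)
              for s in samples]
    return sum(values) / len(samples)

# Hypothetical population, used only for illustration.
y = [12, 30, 45, 70, 160]
x = [10, 22, 40, 55, 130]
print(expectation_over_samples(y, x, 2), Fraction(sum(y), sum(x)))
```

The routine needs the n individual ratios $y_i/x_i$, which is the computational burden referred to in the Remark above.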







5.13 DIFFERENCE ESTIMATION 

As stated before, the ratio estimator is at its best when the relation
between y and x is a straight line through the origin, that is, y - kx = 0.
In case the relationship is of the type y - kx = a (constant), it is natural
to try an estimator based on differences of the form $y_i - kx_i$. Such
estimators are called difference estimators. Instead of estimating the
ratio of $\bar{Y}$ to $\bar{X}$ and multiplying it by the known mean of x to estimate the
population mean $\bar{Y}$, we estimate the difference between $\bar{Y}$ and $k\bar{X}$ and
add to this the known quantity $k\bar{X}$ to estimate $\bar{Y}$. In simple random
sampling the difference estimator and its variance are presented in the
following theorem.

Theorem 5.9

In wtr simple random sampling, an unbiased estimator of the population
mean is given by

\[ \hat{\mu} = (\bar{y} - k\bar{x}) + k\bar{X} \tag{5.15} \]

with a variance of

\[ V(\hat{\mu}) = (1 - f)\,\frac{S_y^2 + k^2S_x^2 - 2k\rho S_xS_y}{n} \tag{5.16} \]

where k is a constant.

proof Since $E(\bar{y} - k\bar{x}) = \bar{Y} - k\bar{X}$, the unbiasedness of $\hat{\mu}$ follows.
Denoting $y_i - kx_i$ by $u_i$, we have

\[ V(\hat{\mu}) = V(\bar{u}) = (1 - f)\,\frac{S_u^2}{n} \]

where

\[ S_u^2 = (N-1)^{-1}\sum (u_i - \bar{U})^2 = (N-1)^{-1}\sum [y_i - \bar{Y} - k(x_i - \bar{X})]^2 = S_y^2 + k^2S_x^2 - 2k\rho S_xS_y \]

This proves the theorem. ■


Corollary

In order to find the best value of k to use, we differentiate $V(\hat{\mu})$ with
respect to k and equate it to zero. This gives $k = \rho S_y/S_x = B$, the
population regression coefficient. Since the second derivative is positive,
the variance would be a minimum for k = B and the minimum variance
is $(1 - f)S_y^2(1 - \rho^2)/n$.

Remark For k = R = Y/X, the variance of the difference estimator is
exactly the same as the approximation, $V_1(\bar{X}\hat{R})$, to the mean square
error of the biased ratio estimator $\bar{X}\bar{y}/\bar{x}$.

Remark The difference estimator is superior to the simple average $\bar{y}$ if
$k^2S_x^2 - 2k\rho S_xS_y < 0$ or k(k - 2B) < 0, that is, if k lies between
0 and 2B. For values of k outside this range, the simple average $\bar{y}$ would
be better.

Remark In order to estimate the variance of the difference estimator
from the sample, we note that $s_u^2 = S(u_i - \bar{u})^2/(n - 1)$ is an unbiased
estimator of $S_u^2$.


Remark The foregoing analysis shows that the method of difference
estimation gives exact results which are simple to apply. Its relative
precision depends on whether a good guess can be made of the regression
coefficient B. In one practical situation it is not difficult to guess B.
This is the case in which the auxiliary variate x is the character y enumerated
on a previous occasion, in which case k could be set at unity.
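
Since the results for the difference estimator are exact, they can be confirmed by enumerating all samples from a small population. This is a sketch under stated assumptions: the function names `difference_estimate` and `difference_variance` and the data are hypothetical, and the product $\rho S_xS_y$ enters through the residuals $y_i - \bar{Y} - k(x_i - \bar{X})$.

```python
from fractions import Fraction
from itertools import combinations

def difference_estimate(ys, xs, k, X_bar):
    """Difference estimator (y_bar - k*x_bar) + k*X_bar of the population mean."""
    n = len(ys)
    return Fraction(sum(ys), n) - k * (Fraction(sum(xs), n) - X_bar)

def difference_variance(y, x, k, n):
    """Exact variance (1 - f) S_u^2 / n for wtr simple random sampling."""
    N = len(y)
    Yb, Xb = Fraction(sum(y), N), Fraction(sum(x), N)
    s_u2 = sum((yi - Yb - k * (xi - Xb)) ** 2 for yi, xi in zip(y, x)) / (N - 1)
    return (1 - Fraction(n, N)) * s_u2 / n

# Hypothetical population, used only for illustration.
y = [12, 30, 45, 70, 160]
x = [10, 22, 40, 55, 130]
N, n, k = len(y), 2, Fraction(5, 4)
X_bar = Fraction(sum(x), N)
estimates = [difference_estimate([y[i] for i in s], [x[i] for i in s], k, X_bar)
             for s in combinations(range(N), n)]
mean = sum(estimates) / len(estimates)
variance = sum((e - mean) ** 2 for e in estimates) / len(estimates)
print(mean, variance, difference_variance(y, x, k, n))
```

Trying several values of k in `difference_variance` shows the variance falling as k approaches the population regression coefficient B.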


5.14 REGRESSION ESTIMATION 

Instead of making a guess of the value of B, the population regression
coefficient, to form the more precise difference estimate, we could as well
estimate B from the sample and use this value in place of k. Since the
sample estimate of B is $b = S(x_i - \bar{x})(y_i - \bar{y})/S(x_i - \bar{x})^2$, the estimator
obtained thereby is

\[ \hat{\mu} = \bar{y} - b(\bar{x} - \bar{X}) \tag{5.17} \]

which is called the regression estimator. Since b is a random variable,
exact expressions for the expected value and the variance of the regression
estimator are hard to obtain. At the same time the calculation of b is
laborious, especially when many items are to be estimated from the same
survey. The result is that the regression estimator (5.17) has not been
used as extensively as have the ratio estimator or the difference estimator.
In this section a large sample approximation to its variance will be given.

Theorem 5.10

In simple random sampling, the large sample variance of the
regression estimator

\[ \hat{\mu} = \bar{y} - b(\bar{x} - \bar{X}) \]

is given by

\[ V(\hat{\mu}) = \frac{S_y^2(1 - \rho^2)}{n} \tag{5.18} \]

proof Since the sample regression coefficient b converges in probability
to a finite value, namely B, the random variable $\sqrt{n}\,(b - B)(\bar{x} - \bar{X})$
converges to zero. Hence the limiting distribution of

\[ \sqrt{n}\,[\bar{y} - b(\bar{x} - \bar{X}) - \bar{Y}] = \sqrt{n}\,[\bar{y} - B(\bar{x} - \bar{X}) - \bar{Y} - (b - B)(\bar{x} - \bar{X})] \]

will be the same as that of $\sqrt{n}\,[\bar{y} - B(\bar{x} - \bar{X}) - \bar{Y}]$. Thus, for large
n, the variance of the regression estimator is

\[ \frac{S_y^2 + B^2S_x^2 - 2B\rho S_xS_y}{n} = \frac{S_y^2(1 - \rho^2)}{n} \]

since $B = \rho S_y/S_x$. ■

Remark This analysis suggests that we could use any random variable h, 
in place of b, where h converges to a finite value. But, it can be easily 
shown that the limiting variance is a minimum when h = b. 


Remark The simple average $\bar{y}$, the ratio estimator $\bar{X}\bar{y}/\bar{x}$, the difference
estimator $\bar{y} - k(\bar{x} - \bar{X})$, and the regression estimator $\bar{y} - b(\bar{x} - \bar{X})$ all
belong to the class of estimators

\[ \bar{y} - h(\bar{x} - \bar{X}) \]


where h is a random variable converging to some finite value. The 
variable h is zero for the simple average, y/x for the ratio estimator, 
k = a constant for the difference estimator, and the sample regression 
coefficient b for the regression estimator. In large samples the regression 
estimator is the most precise for estimating the population mean or total. 
But its use will be justified only in those cases where the gain in precision 
offsets the additional costs involved in computations. If a good guess 
of B can be made on the basis of previous information, the difference 
estimator will be as good as the regression estimator from the point of 
view of precision. It is the best choice in this situation. 

Remark It follows from the proof of the theorem that the contribution 
to the variance arising from the estimation of the population regression 
coefficient B through b is small relative to the total variance when the 
sample size is fairly large. 
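
The regression estimator and its large sample variance can be sketched as follows. The code is illustrative only: the function names `regression_estimate` and `large_sample_variance` and the data are hypothetical, and $S_y^2(1 - \rho^2)$ is computed in the equivalent form $S_y^2 - S_{xy}^2/S_x^2$ to avoid square roots.

```python
from fractions import Fraction

def regression_estimate(ys, xs, X_bar):
    """Regression estimator y_bar - b*(x_bar - X_bar) from a sample."""
    n = len(ys)
    y_bar, x_bar = Fraction(sum(ys), n), Fraction(sum(xs), n)
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    b = sxy / sxx                         # sample regression coefficient
    return y_bar - b * (x_bar - X_bar)

def large_sample_variance(y, x, n):
    """Large sample variance S_y^2 (1 - rho^2) / n from population values."""
    N = len(y)
    Yb, Xb = Fraction(sum(y), N), Fraction(sum(x), N)
    s_yy = sum((yi - Yb) ** 2 for yi in y) / (N - 1)
    s_xx = sum((xi - Xb) ** 2 for xi in x) / (N - 1)
    s_xy = sum((yi - Yb) * (xi - Xb) for yi, xi in zip(y, x)) / (N - 1)
    return (s_yy - s_xy ** 2 / s_xx) / n

# Hypothetical population with an exactly linear relation y = 3 + 2x.
x = [1, 2, 4, 7, 9]
y = [3 + 2 * v for v in x]
X_bar = Fraction(sum(x), len(x))
sample = [0, 2, 4]
print(regression_estimate([y[i] for i in sample], [x[i] for i in sample], X_bar))
```

With an exactly linear relation every sample reproduces the population mean and the large sample variance is zero, which corresponds to ρ = 1 in the theorem.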


Further reading

1. Refer to Exercise 36 for an alternative derivation of the variance.
2. For general methods of generating unbiased ratio and regression
estimators, see Exercises 37 and 38.


5.15 USE OF MULTIAUXILIARY INFORMATION 

The discussion so far has been restricted to the situation in which auxiliary
information on just one x-variate is to be used for improving the precision
of estimates. Frequently we possess information about several x-variates,
and it may be considered important to make use of all the available material
to our advantage. We shall, therefore, present some methods of
using information on several variates $x_1, x_2, \ldots, x_p$. One method
consists in forming a difference estimator of the mean of y's based on
each x-variate and then combining them, using appropriate weights
(Raj, 1965). Let

\[ t_i = \bar{y} - k_i(\bar{x}_i - \bar{X}_i) \tag{5.19} \]

where $k_i$ is any constant. Let $w_i$ (i = 1, 2, . . . , p) be weights adding
up to unity. Then

\[ \hat{\mu} = \sum_i w_it_i \tag{5.20} \]

is an unbiased estimator of $\bar{Y}$. Its variance is given by (Theorem 1.5)

\[ V(\hat{\mu}) = \sum_i\sum_j w_iw_j\operatorname{Cov}(t_i,t_j) \]

Defining $S_{uv}$ as the covariance between u and v and letting 0, 1, . . . , p
stand for the variates $y, x_1, \ldots, x_p$, respectively, we have

\[ \operatorname{Cov}(t_i,t_j) = \frac{1-f}{n}\,(S_{00} - k_iS_{0i} - k_jS_{0j} + k_ik_jS_{ij}) = \frac{1-f}{n}\,a_{ij} \]

Thus

\[ V(\hat{\mu}) = \frac{1-f}{n}\sum_i\sum_j w_iw_ja_{ij} = \frac{1-f}{n}\,wAw' \tag{5.21} \]

where the matrix $A = (a_{ij})$, and $w = (w_1, w_2, \ldots, w_p)$, w' being the
transpose of w. Using the procedure (Sec. 1.10) of combining a number
of estimators, we establish that the optimum $w_j$ is given by

\[ w_j = \frac{\text{sum of the elements in the } j\text{th column of } A^{-1}}{\text{sum of all the } p^2 \text{ elements in } A^{-1}} \]

where $A^{-1}$ is the matrix inverse to A. Using the optimum weights, the
minimum variance is found to be

\[ V(\hat{\mu}) = \frac{1-f}{n}\,\frac{1}{\text{sum of the } p^2 \text{ elements in } A^{-1}} \]

In order to estimate the variance of $\hat{\mu}$, we note that

\[ \hat{\mu} = \bar{y} - \sum_i w_ik_i(\bar{x}_i - \bar{X}_i) \]

\[ V(\hat{\mu}) = V\Bigl(\bar{y} - \sum_i w_ik_i\bar{x}_i\Bigr) = V\Bigl[n^{-1}S\Bigl(y_j - \sum_i w_ik_ix_{ij}\Bigr)\Bigr] \]

and hence an unbiased estimator of the variance is given by

\[ v(\hat{\mu}) = \frac{1-f}{n(n-1)}\,S\Bigl[y_j - \bar{y} - \sum_i w_ik_i(x_{ij} - \bar{x}_i)\Bigr]^2 \tag{5.22} \]



5.16 THE CASE OF TWO x-VARIATES 

The most frequent application will be the use of two x-variates. In this
case the formulas assume the following form:

\[ \hat{\mu} = \bar{y} - w_1k_1(\bar{x}_1 - \bar{X}_1) - w_2k_2(\bar{x}_2 - \bar{X}_2) \]

\[ V(\hat{\mu}) = \frac{1-f}{n}\,(S_{00} + w_1^2k_1^2S_{11} + w_2^2k_2^2S_{22} - 2w_1k_1S_{01} - 2w_2k_2S_{02} + 2w_1w_2k_1k_2S_{12}) \]

With $a_{ij} = S_{00} - k_iS_{0i} - k_jS_{0j} + k_ik_jS_{ij}$ as in Sec. 5.15, the best
weights are

\[ w_1 = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - 2a_{12}} \qquad w_2 = \frac{a_{11} - a_{12}}{a_{11} + a_{22} - 2a_{12}} \]

The minimum variance is

\[ V(\hat{\mu}) = \frac{1-f}{n}\,\frac{a_{11}a_{22} - a_{12}^2}{a_{11} + a_{22} - 2a_{12}} \]

We shall consider the particular case in which the coefficients of
variation of $x_1$ and $x_2$ are equal to c, there is the same correlation $\rho_0$
between y and the $x_i$ (i = 1, 2), and the $k_i$ are the population ratios
$R_i = \bar{Y}/\bar{X}_i$. Further, let ρ denote the correlation coefficient between
$x_1$ and $x_2$ and let $c_0$ be the coefficient of variation of y. Then, it is easy
to check that the best weights are $w_1 = w_2 = 1/2$ and

\[ V(\hat{\mu}) = \frac{1-f}{n}\,\bar{Y}^2\Bigl[\frac{1}{2}c^2(1 + \rho) + (c_0^2 - 2\rho_0cc_0)\Bigr] \]

If only one x-variate is used, the variance is

\[ V(\hat{\mu}) = \frac{1-f}{n}\,\bar{Y}^2(c^2 + c_0^2 - 2\rho_0cc_0) \]

Thus it is always better to use the second variate provided ρ differs from
unity. A comparison can also be made with the case in which no x-variate
is employed. It will be found that the use of one x-variate is justified
if $\rho_0 > 2^{-1}c/c_0$. In the case of two x-variates the criterion is
$\rho_0 > 4^{-1}(1 + \rho)c/c_0$.
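
For two x-variates the best weights and the resulting minimum depend only on the three quantities $a_{11}$, $a_{22}$, and $a_{12}$. A minimal sketch follows; the function name `best_weights_two` and the numerical values are hypothetical, and the returned minimum is the factor that multiplies (1 - f)/n.

```python
from fractions import Fraction

def best_weights_two(a11, a22, a12):
    """Return (w1, w2, minimum of w A w') for the 2 x 2 matrix A = (a_ij)."""
    a11, a22, a12 = Fraction(a11), Fraction(a22), Fraction(a12)
    d = a11 + a22 - 2 * a12               # must be nonzero
    w1 = (a22 - a12) / d
    w2 = (a11 - a12) / d
    minimum = (a11 * a22 - a12 ** 2) / d
    return w1, w2, minimum

# Hypothetical values of a11, a22, a12, used only for illustration.
w1, w2, vmin = best_weights_two(4, 9, 1)
print(w1, w2, vmin)
```

When $a_{11} = a_{22}$ the weights are both 1/2 and the minimum reduces to $(a_{11} + a_{12})/2$, which is the particular case worked out above.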


5.17 MULTIVARIATE RATIO ESTIMATION 

Another method of using information on several variates is to form ratio
estimates instead of difference estimates and weight them suitably.
Olkin (1958) has considered this technique. It is clear that the estimators
will be biased and only approximate expressions for the variances can be
obtained. Denoting by $t_i$ the ratio estimator $\bar{X}_i\bar{y}/\bar{x}_i$ based on the
variate $x_i$, the weighted estimator of $\bar{Y}$ is $\hat{\mu} = \sum w_i\bar{X}_i\bar{y}/\bar{x}_i = \sum w_it_i$,
where $\sum w_i = 1$. The expected value and the variance are given by

\[ E(\hat{\mu}) = \sum_i w_i\bar{X}_iE(\hat{R}_i) \]

\[ V(\hat{\mu}) = \sum_i\sum_j w_iw_j\bar{X}_i\bar{X}_j\operatorname{Cov}(\hat{R}_i,\hat{R}_j) \]

The analysis proceeds on the same lines as discussed before for the case
of the difference estimator. The large sample variance of the ratio
estimator is found to be the same as the exact variance of the difference
estimator when $k_i = R_i$.


Remark A question may be asked whether to base the ratio estimate
on p variates or on p + 1. Exercise 39 shows that it is always better to
include an additional variate.


b« 


8 RATIO ESTIMATION IN STRATIFIED SAMPLING ^ & 

divided uni™ di80US8 the sltuatlon in which the pop'd®*' 0 " 

Up lnt0 a number of strata and the population total * 






*!»■' -WWU 



c . 


& supplementary information 


105 


by 


the ratio 


method. Two different estimators are given. They are 


* '■ x h 




u 

h 


2N h $ h 


(5.23) 


In the first case a separate estimate is made for each stratum total, and
in the second a combined estimate is made for the population total.
By the methods of Sec. 5.3, the bias in $\hat{Y}_c$ relative to the standard
error is given by

\[ \frac{|B(\hat{Y}_c)|}{\sigma(\hat{Y}_c)} \le \mathrm{CV}\left(\sum N_h \bar{x}_h\right) \]

Hence the bias in the combined-ratio estimate will not be important
if the coefficient of variation of $\hat{X} = \sum N_h\bar{x}_h$ is small. For the
separate-ratio estimate, $B(\hat{Y}_s)/\sigma(\hat{Y}_s) = \sum B(\hat{Y}_h)/[\sum V(\hat{Y}_h)]^{1/2}$.
If the bias and the variance of $\hat{Y}_h$ do not differ much from stratum to
stratum, the numerator is $L$ times $B(\hat{Y}_h)$, and the denominator is
$\sqrt{L}$ times $\sigma(\hat{Y}_h)$, which shows that the ratio of
the bias to the standard error will increase as the number of strata
increases. This analysis shows that the bias in the separate-ratio
estimate may not be negligible if the number of strata is large. With
regard to the variances of the two estimators, it may be noted that a
first approximation to the variance of $\hat{Y}_s$ is easy to write down by adding
$V(\hat{Y}_h)$ over strata. For the combined-ratio estimator the quantities
$\sum N_h\bar{y}_h/N$ and $\sum N_h\bar{x}_h/N$ take the role of $\bar{y}$ and $\bar{x}$ of Sec. 5.5. Hence a first
approximation to the mean square error of $\hat{Y}_c$ is given by

\[ \mathrm{MSE}(\hat{Y}_c) = E\left[\sum N_h(\bar{y}_h - R\bar{x}_h)\right]^2
   = \sum \frac{N_h^2}{n_h}(1 - f_h)\left(S_{yh}^2 + R^2 S_{xh}^2 - 2R\rho_h S_{yh}S_{xh}\right) \tag{5.24} \]

As stated before,

\[ \mathrm{MSE}(\hat{Y}_s) = \sum \frac{N_h^2}{n_h}(1 - f_h)\left(S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h\rho_h S_{yh}S_{xh}\right) \tag{5.25} \]


The two expressions for the mean square error show that if the strata
ratios $R_h$ differ considerably from each other and the sample size within
each stratum is reasonably large, the separate-ratio estimate will
be better. Otherwise, the combined-ratio estimate should be used.
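The two estimators of (5.23) can be compared on a small numerical case. The following is only a sketch: the dictionary keys (`N`, `X`, `ybar`, `xbar`) and the two-stratum figures are hypothetical, chosen for illustration and not taken from the text.

```python
def separate_ratio_total(strata):
    """Separate-ratio estimate: sum over strata of (ybar_h / xbar_h) * X_h."""
    return sum(s["ybar"] / s["xbar"] * s["X"] for s in strata)


def combined_ratio_total(strata):
    """Combined-ratio estimate: (sum N_h ybar_h / sum N_h xbar_h) * X."""
    num = sum(s["N"] * s["ybar"] for s in strata)
    den = sum(s["N"] * s["xbar"] for s in strata)
    x_total = sum(s["X"] for s in strata)
    return num / den * x_total


# Hypothetical strata: size N_h, known x-total X_h, sample means of y and x.
strata = [
    {"N": 100, "X": 5000.0, "ybar": 12.0, "xbar": 48.0},
    {"N": 50, "X": 10000.0, "ybar": 90.0, "xbar": 210.0},
]

y_sep = separate_ratio_total(strata)
y_comb = combined_ratio_total(strata)
print(round(y_sep, 1), round(y_comb, 1))
```

In this made-up case the stratum ratios (0.25 and about 0.43) differ a good deal, so the two estimates do not agree; with nearly equal stratum ratios they would be close.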


Further reading

1. When it is considered important to form a type of combined unbiased
ratio estimate, the procedure of Exercise 40 may be used.

2. When two units are selected from each stratum and the combined-ratio
estimate is formed, Exercise 41 presents a short cut to the computation
of the variance.

3. Exercises 26 and 28 present two methods of forming unbiased ratio
estimators in stratified sampling.

4. When the strata frames are not available but the strata sizes are
known, it is possible to achieve precision by making an adjustment
to the sample average based on a random sample from the entire population.
This is shown in the exercises.

5. In case the correlation between y and x is negative, it is advisable to
use the product estimator rather than the ratio estimator. The product
estimator is considered in Exercise 105.


REFERENCES 

Goodman, L. A. and H. O. Hartley (1958). The precision of unbiased ratio-
type estimators. J. Am. Stat. Assoc., 53.

Lahiri, D. B. (1951). A method of sample selection providing unbiased ratio
estimates. Intern. Stat. Inst. Bull., 33.

Midzuno, H. (1950). An outline of the theory of sampling systems. Ann.
Inst. Stat. Math., 1.

Olkin, I. (1958). Multivariate ratio estimation for finite populations.
Biometrika, 45.

Raj, D. (1954). Ratio estimation in sampling with equal and unequal
probabilities. J. Ind. Soc. Agr. Stat., 6.

Raj, D. (1964a). A note on the variance of the ratio estimate. J. Am. Stat.
Assoc., 59.

Raj, D. (1964b). On sampling with probability proportionate to aggregate size.
J. Ind. Soc. Agr. Stat., 16.

Raj, D. (1965). On a method of using multiauxiliary information in sample
surveys. J. Am. Stat. Assoc., 60.





CHAPTER SIX

SAMPLING AND SUBSAMPLING OF CLUSTERS


6.1 INTRODUCTION

We have assumed so far that it is convenient to take a sample of units to 
be investigated directly from the entire population or from within strata. 
While this may be true in some kind of surveys, it is generally not so when 
we are concerned with countrywide investigations. The principal reason 
is that no usable list (called a frame) of units to be enumerated generally 
exists from which to select the sample. As an example, suppose it is 
desired to conduct a sample survey in Greece in which individuals would 
be asked at the time of interview whether they worked for a living last 
week. It is not possible to take a simple random or systematic sample of
persons from the entire country or from within strata, since there is no
such list (verified to be correct) in which the Greek people are numbered
from 1 to N. And it would be impossible to make such a list. Even if
such a list existed, it would not be economical to base the enquiry on a
simple random sample of persons because this would require interviewers
to visit almost every commune in the country and resources do not permit
it. All these considerations point to the need of selecting larger units or
clusters rather than elements (individuals in this case) directly from the
population. One way of selecting the sample would be to secure a list
of communes (which is readily available), take a probability sample of
communes, and enumerate every one in the sample communes. This is 
called single-stage cluster sampling. (The clusters are communes and 
the selection is made in one stage only.) Or, instead of interviewing 
every individual in the commune, we interview only a sample of them. 
This is called two-stage sampling, since now the sample is selected in two 
stages—first the communes (called first-stage or primary sampling units), 
and then the persons within communes. This is also called subsampling, 
since a further sample (of persons) has been taken from a sample (of 
communes). A more convenient way of subsampling the commune 
would be to take a sample of census enumeration districts (ED’s) and 
select a sample of households from each selected ED. This will be three- 
stage sampling—first the commune, then the ED (second-stage unit), 
and then the household (third-stage unit). The interviewer visits the 
households in the sample and collects the information required. The
advantages of this procedure are: lists have to be prepared for the selected 
primary (psu) and subsequent-stage units only; it is easy to check the 
correctness of the lists; the sample gets concentrated in the selected psu’s 
and this reduces costs of travel, etc. 


6.2 SINGLE-STAGE CLUSTER SAMPLING 

No new principles are involved in making estimates when a probability
sample of clusters has been taken and each sample cluster is enumerated
completely (i.e., there is no subsampling). A problem to be considered
is the optimum size of the cluster. This will naturally depend upon the
cost of collecting information from clusters of different size and the
resulting variance. We shall begin by proving the following theorem.

Theorem 6.1 

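If a wtr simple random sample of n clusters, each containing M elements,
is taken from a population of N clusters, an unbiased estimate of the
population total Y is given by

\[ \hat{Y} = \frac{N}{n}\,S\,y_i \tag{6.1} \]

and

\[ V(\hat{Y}) = \frac{N^2}{n}\left(1 - \frac{n}{N}\right)\frac{1}{N - 1}\sum (Y_i - \bar{Y})^2 \tag{6.2} \]

\[ \phantom{V(\hat{Y})} = \frac{N^2}{n}\left(1 - \frac{n}{N}\right)\frac{NM - 1}{N - 1}\,S_y^2\,[1 + (M - 1)\rho] \tag{6.3} \]

where

\[ S_y^2 = \frac{1}{NM - 1}\sum_i\sum_j (y_{ij} - \bar{Y}_e)^2, \qquad
   \bar{Y} = \frac{1}{N}\sum Y_i, \qquad \bar{Y}_e = \frac{\bar{Y}}{M} \]

and $\rho$ is the intracluster correlation coefficient (Sec. 1.9), $Y_i$ is the total
of y for the ith cluster of the population, and $y_i$ is the total of y for the
ith cluster in the sample.

PROOF The relations (6.1) and (6.2) are obvious. To prove (6.3),
we have

\[ \sum_i (Y_i - \bar{Y})^2 = \sum_i \Big[\sum_j (y_{ij} - \bar{Y}_e)\Big]^2
   = \sum_i\sum_j (y_{ij} - \bar{Y}_e)^2 + 2\sum_i\sum_{j<k} (y_{ij} - \bar{Y}_e)(y_{ik} - \bar{Y}_e) \]

\[ \phantom{\sum_i (Y_i - \bar{Y})^2} = (NM - 1)S_y^2 + (M - 1)(NM - 1)\rho S_y^2
   = (NM - 1)S_y^2\,[1 + (M - 1)\rho] \]

Hence

\[ V(\hat{Y}) = \frac{N^2}{n}\left(1 - \frac{n}{N}\right)\frac{NM - 1}{N - 1}\,S_y^2\,[1 + (M - 1)\rho]
   \doteq \frac{N^2}{n}\left(1 - \frac{n}{N}\right) M S_y^2\,[1 + (M - 1)\rho] \tag{6.4} \]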


Remark The variance in cluster sampling depends on the number of
clusters in the sample, the variance $S_y^2$, the size of the cluster M, and the
intracluster correlation coefficient $\rho$.

Remark If, instead of sampling in clusters, a simple random sample of
nM elements be taken directly from the population,

\[ V(\hat{Y}) = \frac{(NM)^2}{nM}\left(1 - \frac{nM}{NM}\right) S_y^2
   = \frac{N^2}{n}\left(1 - \frac{n}{N}\right) M S_y^2 \]

Thus for the same number of elements in the sample, the relationship
between $V_c$ (variance when clusters of size M are taken) and $V_e$ (variance
when elements are taken directly) is

\[ V_c = [1 + (M - 1)\rho]\,V_e \tag{6.5} \]

Generally $\rho$ is positive for clusters of geographically contiguous farms,
stores, establishments, families, etc.
Thus for the same number of elements in the sample, cluster
sampling will give a higher variance than sampling elements directly.
But the real point is that it is far cheaper to collect information on a
per-element basis if sampling is done in clusters. If $\rho$ is negative, both
cost and the variance point to the use of clusters.

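Remark An expression for the intracluster correlation coefficient $\rho$ can
be given in terms of

\[ \sigma^2 = \frac{1}{NM}\sum_i\sum_j (y_{ij} - \bar{Y}_e)^2, \qquad
   \sigma_b^2 = \frac{1}{N}\sum_i (\bar{Y}_i - \bar{Y}_e)^2 \]

and

\[ \sigma_w^2 = \frac{1}{NM}\sum_i\sum_j (y_{ij} - \bar{Y}_i)^2 \]

where $\bar{Y}_i = Y_i/M$ is the mean per element in the ith cluster, in which case

\[ \sigma^2 = \sigma_b^2 + \sigma_w^2 \]

By definition

\[ NM(M - 1)\rho\sigma^2 = \sum_i\Big[\sum_j (y_{ij} - \bar{Y}_e)\Big]^2 - \sum_i\sum_j (y_{ij} - \bar{Y}_e)^2
   = NM^2\sigma_b^2 - NM\sigma^2 \]

or

\[ (M - 1)\rho = \frac{M\sigma_b^2}{\sigma^2} - 1 \]

which means that

\[ \rho = \frac{\sigma_b^2 - \sigma_w^2/(M - 1)}{\sigma^2} \]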
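As a numerical check on the last relation, the sketch below computes $\rho$ in two ways for a small artificial population of equal-sized clusters: once from $\sigma_b^2$ and $\sigma_w^2$, and once from the cross-product definition used in the proof of Theorem 6.1. The function names and the data are illustrative assumptions only.

```python
from itertools import combinations


def rho_from_components(clusters):
    """rho = (sigma_b^2 - sigma_w^2 / (M - 1)) / sigma^2 for N clusters of size M."""
    n_clusters, size = len(clusters), len(clusters[0])
    grand = sum(sum(c) for c in clusters) / (n_clusters * size)
    sigma2 = sum((y - grand) ** 2 for c in clusters for y in c) / (n_clusters * size)
    sigma_b2 = sum((sum(c) / size - grand) ** 2 for c in clusters) / n_clusters
    sigma_w2 = sum((y - sum(c) / size) ** 2 for c in clusters for y in c) / (
        n_clusters * size
    )
    return (sigma_b2 - sigma_w2 / (size - 1)) / sigma2


def rho_from_pairs(clusters):
    """rho from cross-products of deviations of pairs within the same cluster."""
    n_clusters, size = len(clusters), len(clusters[0])
    grand = sum(sum(c) for c in clusters) / (n_clusters * size)
    cross = 2 * sum(
        (a - grand) * (b - grand) for c in clusters for a, b in combinations(c, 2)
    )
    total = sum((y - grand) ** 2 for c in clusters for y in c)  # (NM - 1) S_y^2
    return cross / ((size - 1) * total)


# Artificial population: N = 3 clusters of M = 4 elements each.
population = [[2.0, 3.0, 3.0, 4.0], [6.0, 7.0, 9.0, 8.0], [1.0, 5.0, 2.0, 4.0]]
rho_a = rho_from_components(population)
rho_b = rho_from_pairs(population)
print(round(rho_a, 4), round(rho_b, 4))
```

Clusters whose elements are all alike give $\rho = 1$, and the most heterogeneous clusters give $\rho = -1/(M - 1)$, which is a quick sanity check on either function.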


Further reading

1. See Exercise 43, in which the optimum size of the cluster is found.

2. When information on each element is available, the exercises suggest
a useful way of forming clusters of two elements each.

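6.2.1 ESTIMATION OF PROPORTIONS

Suppose it is desired to estimate the proportion of elements belonging
to a certain class when the population consists of N clusters each of M
elements and a wtr random sample of n clusters is selected. Defining $y_{ij}$
as 1 if the jth element of the ith cluster belongs to the class and 0 otherwise,
it is easy to note that $Y_i$ gives the total number of elements in the ith
cluster belonging to the class. Hence, the proportion P will be

\[ P = \frac{1}{NM}\sum Y_i \]

Applying Theorem 6.1, an unbiased estimate of P is

\[ \hat{P} = \frac{N}{NMn}\,S\,y_i = \frac{1}{nM}\,S\,y_i = \frac{1}{n}\,S\,p_i \tag{6.7} \]

with

\[ V(\hat{P}) = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{\sum (P_i - P)^2}{N - 1} \tag{6.8} \]

since $Y_i = MP_i$ and $\bar{Y} = MP$, where $P_i$ is the proportion in the ith cluster.

If, however, a simple random sample of nM elements could be taken, the
variance of the sample proportion p would be

\[ \frac{1}{nM}\left(1 - \frac{nM}{NM}\right)\frac{NMPQ}{NM - 1}
   = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{NPQ}{NM - 1} \]

where $Q = 1 - P$. When the clusters are of unequal sizes $M_i$, the proportion
may be estimated by the ratio

\[ \hat{P} = \frac{S\,y_i}{S\,M_i} = \frac{S\,M_i p_i}{S\,M_i} \tag{6.9} \]

and a first approximation to the mean square error of $\hat{P}$ is obtained as

\[ \mathrm{MSE}(\hat{P}) = \frac{1}{\bar{M}^2}\,\frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{\sum M_i^2 (P_i - P)^2}{N - 1} \tag{6.10} \]

where $\bar{M} = \sum M_i/N$ (Sec. 5.5).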

6.2.2 ESTIMATION OF EFFICIENCY OF CLUSTER SAMPLING

If the sample is selected in clusters, it is possible to estimate from the
sample itself the variance that would have been obtained if elementary
units had been selected directly from the population without using
clusters. Suppose the cluster size is M and a wtr random sample of n
clusters has been selected. The total of the ith cluster is $Y_i = M\bar{Y}_i$.
By Theorem 3.2, an unbiased estimate of $V(\hat{Y})$ from cluster sampling is
given by

\[ N^2\left(\frac{1}{n} - \frac{1}{N}\right)\frac{S(y_i - \bar{y})^2}{n - 1} \]

where $y_i = S_j\,y_{ij}$ is the total of the ith cluster in the sample and
$\bar{y} = (1/n)\,S\,y_i$. The variance of the population-total estimate based on
a simple random sample of nM elements is

\[ V_1(\hat{Y}) = (NM)^2\left(\frac{1}{nM} - \frac{1}{NM}\right)\frac{\sum_i\sum_j (y_{ij} - \bar{Y}_e)^2}{NM - 1} \]

\[ \phantom{V_1(\hat{Y})} = (NM)^2\left(\frac{1}{nM} - \frac{1}{NM}\right)
   \frac{\sum_i\sum_j (y_{ij} - \bar{Y}_i)^2 + M\sum_i (\bar{Y}_i - \bar{Y}_e)^2}{NM - 1} \]

The purpose is to estimate $V_1(\hat{Y})$ from the cluster sample. By Theorem
3.1 the sample mean is an unbiased estimate of the population mean,
which gives

\[ E\,\frac{S_i \sum_j (y_{ij} - \bar{Y}_i)^2}{n} = \frac{\sum_i\sum_j (y_{ij} - \bar{Y}_i)^2}{N} \]

Similarly, by Theorem 3.2,

\[ E\,\frac{S(\bar{Y}_i - S\bar{Y}_i/n)^2}{n - 1} = \frac{\sum(\bar{Y}_i - \sum\bar{Y}_i/N)^2}{N - 1} \]

Thus both terms comprising $V_1(\hat{Y})$ can be estimated from the cluster
sample if a record is made of observations on all elements included in the
sample clusters.

6.2.3 OTHER ESTIMATORS IN CLUSTER SAMPLING 

The unbiased estimator (6.1) will not be found to be very precise when
the cluster sizes vary. This is clear from the expression for its variance,
namely

\[ V(\hat{Y}) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)\frac{\sum (M_i\bar{Y}_i - \bar{Y})^2}{N - 1} \]

If the average per element $\bar{Y}_i$ does not differ much from cluster to cluster
but the cluster sizes $M_i$ vary considerably, the quantity $M_i\bar{Y}_i$ will be very
variable, and thus $V(\hat{Y})$ will be large. In this situation one may use
the mean of the means, namely

\[ \hat{Y}_1 = \frac{M_0}{n}\,S\,\bar{y}_i \]

where $M_0 = \sum M_i$ is the total number of elements in the population.
This estimator is biased with a bias of $E(\hat{Y}_1) - \sum M_i\bar{Y}_i = \sum(\bar{M} - M_i)\bar{Y}_i$,
where $\bar{M}$ is the average number of elements per cluster. By
Theorem 3.1,

\[ V(\hat{Y}_1) = M_0^2\left(\frac{1}{n} - \frac{1}{N}\right)\frac{\sum (\bar{Y}_i - \sum\bar{Y}_i/N)^2}{N - 1} \]

which would be small if the $\bar{Y}_i$ do not differ much. If the bias in $\hat{Y}_1$ is
important, the sampling scheme can be modified so that $\hat{Y}_1$ becomes unbiased.
This will happen when the clusters are selected with replacement with
probability proportionate to $M_i$. In this situation the variance of $\hat{Y}_1$
would be given by Sec. 3.14. That is, $V(\hat{Y}_1) = M_0\sum M_i(\bar{Y}_i - \bar{Y}_e)^2/n$,
where $\bar{Y}_e$ is the average per element in the population. A comparable
estimate would be the ratio estimate $\hat{Y}_2 = M_0\,(N\,S\,y_i/n)/(N\,S\,M_i/n)$. A
first approximation to its variance would be (Sec. 5.5)

\[ V(\hat{Y}_2) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)\frac{\sum M_i^2(\bar{Y}_i - \bar{Y}_e)^2}{N - 1} \]

the quantity $R = Y/X$ occurring in the variance of the ratio estimate
being $\sum Y_i/M_0 = \bar{Y}_e$, the population mean per element.
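The three estimators of this section (the unbiased estimator (6.1), the mean of the means, and the ratio estimate) can be placed side by side. This is a sketch on invented figures; the function name and the `(M_i, y_i)` tuple layout are assumptions made for the illustration.

```python
def cluster_total_estimates(sample, n_clusters, m0):
    """Return (unbiased, mean_of_means, ratio) estimates of the population total.

    sample     : list of (M_i, y_i) pairs, cluster size and cluster total
    n_clusters : N, number of clusters in the population
    m0         : M_0, total number of elements in the population
    """
    n = len(sample)
    totals = [y for _, y in sample]
    sizes = [m for m, _ in sample]
    unbiased = n_clusters / n * sum(totals)             # (N/n) S y_i
    mean_of_means = m0 / n * sum(y / m for m, y in sample)  # (M_0/n) S ybar_i
    ratio = m0 * sum(totals) / sum(sizes)                # M_0 S y_i / S M_i
    return unbiased, mean_of_means, ratio


# Invented sample: every cluster has mean 5 per element, sizes vary widely.
sample = [(10, 50.0), (40, 200.0), (25, 125.0)]
est_unbiased, est_means, est_ratio = cluster_total_estimates(sample, 10, 300)
print(est_unbiased, est_means, est_ratio)
```

Here the per-element means are identical across clusters, so the last two estimates coincide at 1500, while the unbiased estimate depends on which cluster sizes happened to fall in the sample.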

Further reading Another method of selecting two different clusters 
with varying probabilities consists in making two independent selections 
with pp to x and accepting the sample when both units are different. If 
the simple average of the means is used as the estimator, its status is 
studied in Exercise 11. 


6.3 MULTISTAGE SAMPLING 


We now turn to the situation in which the sample clusters are subsampled.
The first thing to be understood is the formation of estimates of population
totals, means, ratios, and proportions from a given subsampling (or
multistage) design. The basic principle is that of building up estimates from
the bottom (last-stage units) to the top. For example, suppose a
commune contains N ED's from which one ED is selected at random.
Let the selected ED contain M households from which m are selected at
random. Every household in the sample provides information on y
(which is, say, expenditure on bread last week). The sample mean
$\bar{y} = (1/m)\,S\,y_j$ estimates the average expenditure per household in the ED
and $M\bar{y}$ estimates the total expenditure on bread in the ED. But since
this ED was selected at random from the N in the commune, the estimate
of the total in the commune is $NM\bar{y}$. Now suppose that not 1 but n ED's
were selected without replacement with equal probabilities and the
selected ED's contained $M_1, M_2, \ldots, M_n$ households, respectively,
from which random samples of $m_1, m_2, \ldots, m_n$ households were taken.
Then, $M_i(1/m_i)\,S_j\,y_{ij} = M_i\bar{y}_i$ $(i = 1, \ldots, n)$ will estimate the totals of
the sample ED's, and therefore $(1/n)\,S\,M_i\bar{y}_i$ will estimate the average per
ED, and hence $\hat{Y} = (N/n)\,S\,M_i\bar{y}_i$ will estimate the total expenditure on
bread for the whole commune. We can also estimate the total number of
households in the commune. The quantity $(1/n)\,S\,M_i$ estimates the
average number of households per ED, and hence $\hat{X} = (N/n)\,S\,M_i$
estimates the total number of households. Dividing the estimated
expenditure $\hat{Y}$ by $\hat{X}$, we get an estimate of the average expenditure per
household. Algebraically, we can find the expectation of $\hat{Y}$ as

\[ E(\hat{Y}) = E_1E_2(\hat{Y}) = E_1\left[\frac{N}{n}\,S\,M_iE_2(\bar{y}_i)\right]
   = E_1\left[\frac{N}{n}\,S\,M_i\bar{Y}_i\right] = \sum Y_i = Y \]
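The bottom-to-top construction described above can be sketched as follows. The function name and the layout of `sample` (one `(M_i, observations)` pair per selected ED) are assumptions for the illustration, and the figures are invented.

```python
def two_stage_estimates(sample, n_eds_in_commune):
    """Build up estimates from households to ED's to the commune.

    sample           : list of (M_i, [y_i1, ..., y_im_i]) for the n selected ED's
    n_eds_in_commune : N, number of ED's in the commune
    Returns (Y_hat, X_hat, mean_per_household).
    """
    n = len(sample)
    # Stage 2: M_i * ybar_i estimates the total of each selected ED.
    ed_totals = [m_i * sum(ys) / len(ys) for m_i, ys in sample]
    # Stage 1: scale the average per ED up to the N ED's of the commune.
    y_hat = n_eds_in_commune / n * sum(ed_totals)
    x_hat = n_eds_in_commune / n * sum(m_i for m_i, _ in sample)
    return y_hat, x_hat, y_hat / x_hat


# Invented data: two ED's with 20 and 30 households; 2 and 3 households observed.
sample = [(20, [3.0, 5.0]), (30, [4.0, 4.0, 7.0])]
y_hat, x_hat, mean_hh = two_stage_estimates(sample, 10)
print(y_hat, x_hat, mean_hh)
```

The ratio `y_hat / x_hat` is the estimate of average expenditure per household mentioned in the text; being a ratio of two estimates, it is treated in Sec. 6.8.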


6.3.1 CALCULATION OF VARIANCE 
The variance of $\hat{Y}$ will be obtained by using the conditional argument.
By Theorem 1.8,

\[ V(\hat{Y}) = V_1E_2(\hat{Y}) + E_1V_2(\hat{Y}) \]

Here $E_2$ and $V_2$ represent the conditional expectation and variance over
selections of $m_i$ households from the ED's which are fixed (like strata);
and $V_1$ and $E_1$ are the variance and expectation over all possible
samples of n ED's from N. Now

\[ E_2(\hat{Y}) = \frac{N}{n}\,S\,Y_i, \qquad
   V_1E_2(\hat{Y}) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 \]

and

\[ V_2(\hat{Y}) = \frac{N^2}{n^2}\,S\,M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2, \qquad
   E_1V_2(\hat{Y}) = \frac{N}{n}\sum M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

where $S_b^2 = \sum(Y_i - \bar{Y})^2/(N - 1)$ and $S_{wi}^2 = \sum_j(y_{ij} - \bar{Y}_i)^2/(M_i - 1)$.
This gives

\[ V(\hat{Y}) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2
   + \frac{N}{n}\sum M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

The variance is thus a sum of two components. One arises from the
variability of second-stage units within psu's and the other arises from
the variance of psu totals. If a direct sample of second-stage units can
be taken, the variance of psu totals will not come into the picture.

6,3.2 ESTIMATION OF VARIANCE 

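To continue with the previous illustration, we shall now estimate $V(\hat{Y})$
from the sample. It is natural to try the variance between the estimates
of the ED totals. We have

\[ E\,S\left(M_i\bar{y}_i - \frac{1}{n}\,S\,M_i\bar{y}_i\right)^2
   = E\left[S\,M_i^2\bar{y}_i^2 - \frac{1}{n}\left(S\,M_i\bar{y}_i\right)^2\right] \]

Using the formula $E(x^2) = E^2(x) + V(x)$, we get

\[ E_2(M_i^2\bar{y}_i^2) = Y_i^2 + M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

and

\[ E_2\left(S\,M_i\bar{y}_i\right)^2 = \left(S\,Y_i\right)^2 + S\,M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

Hence

\[ E\,S\left(M_i\bar{y}_i - \frac{1}{n}\,S\,M_i\bar{y}_i\right)^2
   = (n - 1)\frac{\sum(Y_i - \bar{Y})^2}{N - 1}
   + \frac{n - 1}{N}\sum M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

But

\[ E\,S\,M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{wi}^2
   = \frac{n}{N}\sum M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

This leads to the following theorem.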





Theorem 6.2 

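Let n psu's be selected at random from a population of N psu's. Let
random samples of $m_i$ $(i = 1, \ldots, n)$ second-stage units be taken from
the $M_i$ $(i = 1, \ldots, n)$ second-stage units in the selected psu's. Then

\[ \hat{Y} = \frac{N}{n}\,S\,M_i\bar{y}_i \tag{6.11} \]

\[ V(\hat{Y}) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)S_b^2
   + \frac{N}{n}\sum M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \tag{6.12} \]

and

\[ \hat{V}(\hat{Y}) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)s_b^2
   + \frac{N}{n}\,S\,M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{wi}^2 \tag{6.13} \]

where

\[ \bar{y}_i = \frac{1}{m_i}\,S\,y_{ij}, \qquad
   S_b^2 = \frac{1}{N - 1}\sum (Y_i - \bar{Y})^2, \qquad \bar{Y} = \frac{1}{N}\sum Y_i \]

\[ S_{wi}^2 = \frac{1}{M_i - 1}\sum_j (y_{ij} - \bar{Y}_i)^2, \qquad
   s_b^2 = \frac{1}{n - 1}\,S\left(M_i\bar{y}_i - \frac{1}{n}\,S\,M_i\bar{y}_i\right)^2 \]

\[ s_{wi}^2 = \frac{1}{m_i - 1}\,S\,(y_{ij} - \bar{y}_i)^2 \]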
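A direct computation of the variance estimate (6.13) may make its two parts easier to see. The sketch below assumes at least two sample psu's and at least two observations per psu; the function name and data layout are hypothetical.

```python
from statistics import mean, variance


def two_stage_variance_estimate(sample, n_psus):
    """Variance estimate (6.13) for equal-probability two-stage sampling.

    sample : list of (M_i, [y_i1, ..., y_im_i]) for the n selected psu's
    n_psus : N, number of psu's in the population
    """
    n = len(sample)
    psu_totals = [m_i * mean(ys) for m_i, ys in sample]
    s_b2 = variance(psu_totals)  # divisor n - 1, as in s_b^2
    between = n_psus**2 * (1 / n - 1 / n_psus) * s_b2
    within = n_psus / n * sum(
        m_i**2 * (1 / len(ys) - 1 / m_i) * variance(ys) for m_i, ys in sample
    )
    return between + within


# Invented data: N = 10 psu's, n = 2 selected.
sample = [(20, [3.0, 5.0]), (30, [4.0, 4.0, 7.0])]
v_hat = two_stage_variance_estimate(sample, 10)
print(round(v_hat, 1))
```

When every selected psu is completely enumerated ($m_i = M_i$) the second part vanishes and the estimate reduces to the single-stage form.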


Remark In case the same number of second-stage units is taken in the
sample from every psu, $m_i = m$. However, in many practical applications
the $m_i$ are determined such that

\[ \hat{Y} = \frac{N}{n}\,S\,\frac{M_i}{m_i}\,S\,y_{ij} = k\,S\,S\,y_{ij} \]

which means that the estimate is made by simply adding the observations
from all the psu's. In this case

\[ \frac{m_i}{M_i} = \frac{N}{kn} \]

which gives a constant sampling fraction from each psu.


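Remark If all psu's have the same number of second-stage units M and
a constant number m of them is sampled from every sample psu,

\[ \hat{Y} = \frac{NM}{nm}\,S\,S\,y_{ij} = \frac{NM}{n}\,S\,\bar{y}_i \]

where $\bar{y}_i = (1/m)\,S_j\,y_{ij}$, and

\[ V(\hat{Y}) = N^2(1 - f_1)\frac{S_b^2}{n} + N^2M^2(1 - f_2)\frac{S_w^2}{mn} \]

where $S_w^2 = \sum S_{wi}^2/N$, $f_1 = n/N$, and $f_2 = m/M$.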





Let the cost of the survey be $C = c_1n + c_2nm$, in which the two
components are proportional to the number of psu's and the number of
second-stage units. It is then possible to find the best values of n and
m for which $V(\hat{Y})$ is a minimum for a given cost C. Using Lagrange's
method of undetermined multipliers, we construct the function
$G = V(\hat{Y}) + \lambda(c_1n + c_2nm - C)$. Differentiating G with respect
to n and m, equating the resulting expressions to zero, and eliminating
$\lambda$, we have

\[ m = \frac{MS_w\,(c_1/c_2)^{1/2}}{(S_b^2 - MS_w^2)^{1/2}} \tag{6.14} \]

The best value of n is obtained from the relation $C = c_1n + c_2nm$ by the
substitution of m from (6.14). Equation (6.14) shows that m, the
number of second-stage units to be taken from a psu, should be larger
if $S_w$, the variability of y within psu's, is larger, or the cost per primary
unit ($c_1$) is larger, or the cost per secondary unit ($c_2$) is smaller, and if
$S_b^2$, the variability of psu totals, is smaller.
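Formula (6.14) and the cost relation can be evaluated numerically. The sketch below is an illustration under assumed values of $S_b^2$, $S_w^2$, M, $c_1$, $c_2$ and C; the helper `variance_times_cost` is introduced here only to check that the computed m is a minimum, and in practice m and n would be rounded to integers.

```python
import math


def optimum_allocation(s_b2, s_w2, psu_size, c1, c2, budget):
    """Return (m, n) from (6.14) and the cost relation C = c1*n + c2*n*m."""
    denom = s_b2 - psu_size * s_w2
    if denom <= 0:
        # (6.14) has no real solution; the variance keeps falling as m grows.
        raise ValueError("S_b^2 - M*S_w^2 must be positive for formula (6.14)")
    m = psu_size * math.sqrt(s_w2) * math.sqrt(c1 / c2) / math.sqrt(denom)
    m = min(m, psu_size)  # cannot take more subunits than the psu contains
    n = budget / (c1 + c2 * m)
    return m, n


def variance_times_cost(m, s_b2, s_w2, psu_size, c1, c2):
    """Quantity minimised by the optimum m (terms of V depending on n and m)."""
    return ((s_b2 - psu_size * s_w2) + psu_size**2 * s_w2 / m) * (c1 + c2 * m)


# Assumed inputs: S_b^2 = 5000, S_w^2 = 4, M = 100, c1 = 16, c2 = 1, C = 1000.
m_opt, n_opt = optimum_allocation(5000.0, 4.0, 100, 16.0, 1.0, 1000.0)
print(round(m_opt, 2), round(n_opt, 2))
```

With these assumed figures about 12 subunits per psu and 36 psu's would be indicated; raising $c_1$ or $S_w$ pushes m upward, as the text states.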


Remark It must be pointed out that the above analysis holds only when
the cost function employed is appropriate. If travel between psu's is a
major component of field costs, a more relevant cost function appears to
be (see Exercise 100)

\[ C = c_0\sqrt{n} + c_1n + c_2nm \]

since travel between n points (Mahalanobis, 1940; Hansen, et al., 1953)
is represented more appropriately by a quantity proportional to $\sqrt{n}$.
The simplest cost function, which is not realistic in large-scale surveys,
would be $C = cnm$. In this case the cost is simply proportional to the
number of second-stage units. It is obvious from the expression for
$V(\hat{Y})$ that, in this situation, n should be as large as possible so that $m = 1$.


6.4 SELECTION OF PSU'S WITH UNEQUAL PROBABILITIES

We shall now prove a very general result applicable whenever the psu's
are selected without replacement. Let $\pi_i$ be the probability that psu $U_i$
is in the sample of n out of N. Further, let $\pi_{ij}$ be the probability that
both $U_i$ and $U_j$ are in the sample. Each selected psu is subsampled in
a known manner, whatever the number of stages of subsampling be.
Let $T_i$ be an unbiased estimator of the ith psu total ($Y_i$) based on
subsampling at the second and subsequent stages with $V_2(T_i) = \sigma_i^2$ and let
$E_2(\hat{\sigma}_i^2) = \sigma_i^2$. Calling this sampling method scheme A, we prove the
following theorem.




Theorem 6.3 

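Under scheme A, an unbiased estimator of the population total Y
is given by

\[ \hat{Y} = S\,\frac{T_i}{\pi_i} \tag{6.15} \]

with

\[ V(\hat{Y}) = \sum_i\sum_{j>i} (\pi_i\pi_j - \pi_{ij})\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2
   + \sum \frac{\sigma_i^2}{\pi_i} \tag{6.16} \]

and

\[ \hat{V}(\hat{Y}) = S_i\,S_{j>i}\,\frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{T_i}{\pi_i} - \frac{T_j}{\pi_j}\right)^2
   + S\,\frac{\hat{\sigma}_i^2}{\pi_i} \tag{6.17} \]

PROOF

\[ E(\hat{Y}) = E_1\,S\,\frac{1}{\pi_i}E_2(T_i) = E_1\,S\,\frac{Y_i}{\pi_i} = Y \qquad \text{by (3.32)} \]

Since $E_2(\hat{Y}) = S\,Y_i/\pi_i$,
$V_1E_2(\hat{Y}) = \sum_i\sum_{j>i}(\pi_i\pi_j - \pi_{ij})(Y_i/\pi_i - Y_j/\pi_j)^2$. Again

\[ E_1V_2(\hat{Y}) = E_1\,S\,\frac{\sigma_i^2}{\pi_i^2} = \sum\frac{\sigma_i^2}{\pi_i} \]

which proves (6.16). Further,

\[ E_2\left(\frac{T_i}{\pi_i} - \frac{T_j}{\pi_j}\right)^2
   = \left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2 + \frac{\sigma_i^2}{\pi_i^2} + \frac{\sigma_j^2}{\pi_j^2} \]

But $\sum_{j \ne i}(\pi_i\pi_j - \pi_{ij}) = \pi_i(1 - \pi_i)$. Hence

\[ E_1\,S_i\,S_{j>i}\,\frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{\sigma_i^2}{\pi_i^2} + \frac{\sigma_j^2}{\pi_j^2}\right)
   = \sum\frac{\sigma_i^2}{\pi_i}(1 - \pi_i) \tag{6.18} \]

Hence

\[ E\,S_i\,S_{j>i}\,\frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{T_i}{\pi_i} - \frac{T_j}{\pi_j}\right)^2
   = V(\hat{Y}) - \sum\sigma_i^2 \]

But

\[ E\,S\,\frac{\hat{\sigma}_i^2}{\pi_i} = \sum\sigma_i^2 \]

Hence $V(\hat{Y})$ is estimated by the expression given in (6.17).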

Corollary 

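In case of two-stage sampling when selection is simple random at both
stages, we have

\[ \pi_i = \frac{n}{N}, \qquad \pi_{ij} = \frac{n(n - 1)}{N(N - 1)}, \qquad T_i = M_i\bar{y}_i \]

\[ \hat{\sigma}_i^2 = M_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\frac{S\,(y_{ij} - \bar{y}_i)^2}{m_i - 1} \]

Substituting these values in formulas (6.15) to (6.17), we prove Theorem
6.2.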
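The estimator (6.15) and the variance estimate (6.17) can be written down directly once the $\pi_i$, $\pi_{ij}$, $T_i$ and $\hat{\sigma}_i^2$ are available. In the sketch below these are supplied as plain lists and a dictionary keyed by index pairs, which is an assumed layout; the numerical case is the equal-probability one of the corollary, with invented data.

```python
from itertools import combinations


def scheme_a_estimate(t, pi, pij, sigma2_hat):
    """Return (Y_hat, V_hat) from (6.15) and (6.17).

    t          : estimated psu totals T_i for the sample psu's
    pi         : inclusion probabilities pi_i of the sample psu's
    pij        : dict {(i, j): pi_ij} for sample index pairs with i < j
    sigma2_hat : unbiased estimates of sigma_i^2 = V_2(T_i)
    """
    n = len(t)
    y_hat = sum(t_i / p_i for t_i, p_i in zip(t, pi))
    first = sum(
        (pi[i] * pi[j] - pij[(i, j)]) / pij[(i, j)]
        * (t[i] / pi[i] - t[j] / pi[j]) ** 2
        for i, j in combinations(range(n), 2)
    )
    second = sum(s / p_i for s, p_i in zip(sigma2_hat, pi))
    return y_hat, first + second


# Equal-probability case with N = 10, n = 2 (invented data).
# psu 1: M = 20, observations 3, 5;  psu 2: M = 30, observations 4, 4, 7.
t = [20 * 4.0, 30 * 5.0]
pi = [2 / 10, 2 / 10]
pij = {(0, 1): 2 * 1 / (10 * 9)}
sigma2_hat = [20**2 * (1 / 2 - 1 / 20) * 2.0, 30**2 * (1 / 3 - 1 / 30) * 3.0]
y_hat, v_hat = scheme_a_estimate(t, pi, pij, sigma2_hat)
print(round(y_hat, 1), round(v_hat, 1))
```

The same figures substituted in (6.13) give the same variance estimate, which is what the corollary asserts.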

Remark Note that the first term in $V(\hat{Y})$ represents the variance in
unistage sampling (as if the psu's are measured without error), and the
second term gives the contribution to variance from subsampling of the
psu's. The same remarks apply to the estimate of variance.


Remark One of the requirements of this scheme is the existence of
unbiased estimators of $\sigma_i^2$. Hence systematic sampling cannot be used
at the second stage if an unbiased estimate of the variance is desired.

Remark A comparison of Eqs. (6.16) and (6.17) suggests a useful method 
of estimating the variance in multistage designs. This is proved in Sec. 
9.9 and Exercise 56. 


6.5 SELECTION OF PSU'S WITH REPLACEMENT

Another sampling scheme, which we shall call scheme B, consists in
selecting a sample of n psu's with replacement with probabilities $p_i$
$(i = 1, 2, \ldots, N)$, $\sum p_i = 1$. An independent subsample is taken from
every selection i (whether a repetition or not). The quantities $T_i$ and
$\sigma_i^2 = V_2(T_i)$ are defined as in scheme A of Sec. 6.4. We then prove the
following theorem.

Theorem 6.4

Under scheme B, an unbiased estimator of the population total is
given by

\[ \hat{Y} = \frac{1}{n}\,S\,\frac{T_i}{p_i} \tag{6.19} \]

and

\[ V(\hat{Y}) = \frac{1}{n}\sum p_i\left(\frac{Y_i}{p_i} - Y\right)^2 + \frac{1}{n}\sum\frac{\sigma_i^2}{p_i} \tag{6.20} \]

\[ \hat{V}(\hat{Y}) = \frac{1}{n(n - 1)}\,S\left(\frac{T_i}{p_i} - \frac{1}{n}\,S\,\frac{T_i}{p_i}\right)^2 \tag{6.21} \]


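PROOF

\[ E(\hat{Y}) = E_1\,\frac{1}{n}\,S\,\frac{Y_i}{p_i} = Y \]

Because of independence of the random variables $T_i/p_i$,

\[ V(\hat{Y}) = \frac{1}{n}V\left(\frac{T_i}{p_i}\right)
   = \frac{1}{n}\left[\sum p_i\,\frac{Y_i^2 + \sigma_i^2}{p_i^2} - Y^2\right] \]

which is given by (6.20). Since the random variables $T_i/p_i$ are independent,
we establish from Theorem 3.4 that $V(\hat{Y})$ has the unbiased estimator
given by (6.21).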
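Under scheme B both the estimate and its variance estimate depend only on the n quantities $T_i/p_i$, which the following sketch makes explicit. The function name and the figures are illustrative assumptions.

```python
def scheme_b_estimate(t, p):
    """Return (Y_hat, V_hat) from (6.19) and (6.21) for with-replacement sampling.

    t : estimated psu totals T_i, one per selection (repeats listed separately)
    p : selection probabilities p_i of the selected psu's, in the same order
    """
    n = len(t)
    z = [t_i / p_i for t_i, p_i in zip(t, p)]  # independent estimates of Y
    y_hat = sum(z) / n
    v_hat = sum((z_i - y_hat) ** 2 for z_i in z) / (n * (n - 1))
    return y_hat, v_hat


# Invented sample of n = 3 selections.
t = [80.0, 150.0, 120.0]
p = [0.08, 0.12, 0.10]
y_hat, v_hat = scheme_b_estimate(t, p)
print(round(y_hat, 2), round(v_hat, 2))
```

No within-psu variance estimate enters the computation, which is why the within-psu selection need not itself admit an unbiased variance estimate under this scheme.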

Corollary

Let the psu's be selected with probabilities $p_i$ based on measures
of size $x_i$, $p_i = x_i/\sum x_i$, and let a simple random sample of $m_i$
households be taken within a selected psu from the $M_i$. Then

\[ T_i = M_i\bar{y}_i, \qquad \hat{Y} = \frac{1}{n}\,S\,\frac{M_i\bar{y}_i}{p_i} \]

\[ V(\hat{Y}) = \frac{1}{n}\sum p_i\left(\frac{Y_i}{p_i} - Y\right)^2
   + \frac{1}{n}\sum\frac{M_i^2}{p_i}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{wi}^2 \]

If $m_i$ is chosen such that $M_i/(m_ip_i)$ is a constant, the estimate is
self-weighting. When the measure of size is the number of households or
the number of persons enumerated at a previous census, the quantity
$M_i/p_i$ may be nearly constant. In that case a constant number of
households will be taken in the sample from each psu. This is an important
requirement in some surveys.

Remark An unbiased estimator of the variance is calculable simply
from the quantities $T_i/p_i$, and within-psu variances $\sigma_i^2$ need not be
estimated. Thus, if desired, systematic sampling could be used at the
beginning of the second stage of sampling. For example, in a two-stage
design, villages could be selected with pps and households within villages
could be selected using systematic sampling.


Further reading 


1. See Exercise 55 for a better estimator based on distinct subunits. 

2. Instead of subsampling a psu independently $\lambda$ times (the frequency
of its occurrence in the sample), it may be better to take a wtr random
sample of $m\lambda$ subunits from it. This is done in Exercise 46.

3. See Exercise 14, in which it is shown that it may be better to select
the random starts in complementary pairs rather than take them
independently when sample selection within psu's is systematic.


6.5.1 SELECTION WITH REPLACEMENT (SCHEME C) 


Instead of subsampling a psu independently every time it enters the 
sample, it would be cheaper to subsample it only once (like scheme A) 
and weight it by the frequency with which it occurs in the sample (Raj, 
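1954). If $\lambda_i$ is the number of times $U_i$ appears in the sample, we use the
estimator

\[ \hat{Y} = \frac{1}{n}\sum\lambda_i\,\frac{T_i}{p_i} \tag{6.22} \]

where

\[ E(\lambda_i) = np_i \qquad E(\lambda_i^2) = np_i(1 - p_i) + n^2p_i^2 \qquad
   \operatorname{Cov}(\lambda_i, \lambda_j) = -np_ip_j \]

We have $E(\hat{Y}) = Y$. And

\[ V(\hat{Y}) = \frac{1}{n}\sum p_i\left(\frac{Y_i}{p_i} - Y\right)^2
   + \frac{1}{n}\sum\frac{\sigma_i^2}{p_i} + \frac{n - 1}{n}\sum\sigma_i^2 \]

which is larger than $V(\hat{Y})$ of scheme B (Eq. 6.20).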


6.6 COMPARISON OF SCHEMES A AND B 


Let the without-replacement sampling scheme of psu's be such that
$\pi_i = np_i$. (An illustration of this is given in Sec. 3.18 where the psu's
may be randomized before selection to ensure that no $\pi_{ij}$ is zero.) Thus
the two unbiased estimators have the same form. The component of
variance arising from subsampling of the psu's is also the same in each
case. Thus a comparison of the two schemes in multistage sampling
reduces to a comparison in unistage sampling, which has been discussed
in Sec. 3.23.


6.6.1 COMPARISON BASED ON THE SAMPLE 

Given the results of a multistage survey in which the psu’s have been 
selected with unequal probabilities without replacement, it is possible 
to estimate the variance that would have been obtained if the psu’s had 
been selected with replacement (Raj, 1964). 

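This is done by estimating

\[ V_{wr}(\hat{Y}) = \frac{1}{n}\sum p_i\left(\frac{Y_i}{p_i} - Y\right)^2 + \frac{1}{n}\sum\frac{\sigma_i^2}{p_i} \tag{6.24} \]

from the sample, where "wr" stands for with-replacement sampling and
the $np_i$ of the wr scheme equal $\pi_i$ of the wtr (without-replacement)
scheme.

We have

\[ V_{wr}(\hat{Y}) = \sum\frac{Y_i^2 + \sigma_i^2}{\pi_i} - \frac{Y^2}{n}, \qquad
   E\,S\,\frac{T_i^2}{\pi_i^2} = \sum\frac{Y_i^2 + \sigma_i^2}{\pi_i} \]

Hence, since $\hat{Y}^2 - \hat{V}_{wtr}(\hat{Y})$ is an unbiased estimate of $Y^2$,

\[ \hat{V}_{wr}(\hat{Y}) = S\,\frac{T_i^2}{\pi_i^2} - \frac{1}{n}\left[\hat{Y}^2 - \hat{V}_{wtr}(\hat{Y})\right] \tag{6.25} \]

or

\[ \hat{V}_{wr}(\hat{Y}) = \frac{1}{n}\left[\hat{V}_{wtr}(\hat{Y}) + n\,S\left(\frac{T_i}{\pi_i} - \frac{\hat{Y}}{n}\right)^2\right] \tag{6.26} \]

By comparing $\hat{V}_{wr}$ with $\hat{V}_{wtr}$ we can make an estimate of the gain due to
the use of without-replacement sampling.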


6.7 STRATIFIED MULTISTAGE SAMPLING 


The theory discussed in the earlier sections is applicable when psu’s are 
selected from a stratum. No new principles are involved when the 
object is to estimate the total of a population divided up into L strata, 
and sampling within one stratum is independent of sampling within 
another. The estimates, as well as the variance, are simply added, and 
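the same holds for variance estimation. For example, suppose $n_h$ psu's
are selected without replacement from the hth stratum, and a simple
random sample of $m_{hi}$ subunits is taken from the $M_{hi}$ contained in the
ith psu there. Then the population total Y may be estimated by

\[ \hat{Y} = \sum_h S_i\,\frac{M_{hi}\bar{y}_{hi}}{\pi_{hi}} \tag{6.27} \]

and

\[ V(\hat{Y}) = \sum_h\left[\sum_i\sum_{j>i}(\pi_{hi}\pi_{hj} - \pi_{hij})\left(\frac{Y_{hi}}{\pi_{hi}} - \frac{Y_{hj}}{\pi_{hj}}\right)^2
   + \sum_i\frac{M_{hi}^2}{\pi_{hi}}\left(\frac{1}{m_{hi}} - \frac{1}{M_{hi}}\right)S_{whi}^2\right] \tag{6.28} \]

Formulas (6.27) and (6.28) are obtained by setting

\[ T_i = M_i\bar{y}_i = \frac{M_i}{m_i}\,S\,y_{ij} \]

in Theorem 6.3. An unbiased estimator of $V(\hat{Y})$ is obtained by adding
the expression in (6.17) over strata.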


It is of great interest to note that a particular choice of the $m_i$ from
the $M_i$ subunits makes the estimate $\hat{Y}$ very convenient to calculate. In
fact, for

\[ \frac{m_{hi}}{M_{hi}} = (k\pi_{hi})^{-1} \tag{6.29} \]

the estimator becomes

\[ \hat{Y} = k\sum_h S_i\,S_j\,y_{hij} \tag{6.30} \]

which means that observations over the subunits are simply added up
without regard to the psu or the stratum to which these belong.
Therefore the estimator is called "self-weighting." The quantity
$1/k$ is the expected sampling fraction for subunits.
Thus, if it is desired to take a sample of 1 percent of the subunits (on
the average) from each stratum, we put $k = 100$ in (6.29) and
ask the field worker to use a sampling fraction of $1/(100\pi_i)$ from
the ith psu, where $\pi_i$ is known. If different k are used in different
strata, strata estimates have to be properly weighted.
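The effect of the choice (6.29) can be seen on a single stratum. The sketch below picks $m_i = M_i/(k\pi_i)$ and checks that the weighted estimate $S\,M_i\bar{y}_i/\pi_i$ collapses to k times the plain sum of observations. The function names, the value of k, and the data are assumptions for the illustration; in practice $m_i$ has to be rounded to an integer.

```python
def subsample_sizes(psu_sizes, pi, k):
    """m_i = M_i / (k * pi_i), the choice (6.29) that makes the estimate self-weighting."""
    return [m_i / (k * p_i) for m_i, p_i in zip(psu_sizes, pi)]


def weighted_total(sample, pi):
    """S M_i * ybar_i / pi_i, the estimator of Theorem 6.3 with T_i = M_i * ybar_i."""
    return sum(m_i * sum(ys) / len(ys) / p_i for (m_i, ys), p_i in zip(sample, pi))


def self_weighting_total(sample, k):
    """k times the simple sum of all observations, as in (6.30)."""
    return k * sum(sum(ys) for _, ys in sample)


k = 10
psu_sizes = [20, 30]
pi = [0.5, 0.6]
sizes = subsample_sizes(psu_sizes, pi, k)  # about 4 and 5 subunits

# Invented observations with exactly those subsample sizes.
sample = [(20, [1.0, 2.0, 3.0, 4.0]), (30, [2.0, 2.0, 3.0, 3.0, 5.0])]
total_a = weighted_total(sample, pi)
total_b = self_weighting_total(sample, k)
print(sizes, round(total_a, 6), total_b)
```

The two totals agree, so the field tabulation needs only the single multiplier k.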


Further reading See Exercise 47, in which the gain due to stratification
is evaluated.


6.8 ESTIMATION OF RATIOS

In multistage sampling, when the total number of units in the population
is known, the mean per unit is estimated by dividing the estimate of the
total by this number. This is different from estimating, say, the average
per household from a two-stage design in which villages are the psu's
and households the second-stage units. The total number of households
will generally not be known but will be estimated from the sample.
Thus the estimate of the mean will be the ratio of two random variables.
Again, the object may be to estimate the ratio of y, say expenditure on
foods, to x, say expenditure on furniture, in which case the estimator is a
ratio estimate. Or a ratio estimate may be used in place of the unbiased
estimate in order to use supplementary information to improve the
precision of $\hat{Y}$ when the sample is selected through multistage sampling.
In all these situations ratio estimators will be involved. We shall now
develop theory appropriate to these situations. Two-stage sampling
will be considered in which a random sample of $m_i$ subunits is taken from
the $M_i$ contained in the ith psu which has been selected with probability
$\pi_i$ in case of wtr sampling and with probability $p_i$ in a sample of size 1 in
case of wr sampling.


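6.8.1 SAMPLING WITHOUT REPLACEMENT

If the psu's are selected following scheme A of Sec. 6.4, an estimator of
$R = Y/X$ will be given by

\[ \hat{R} = \frac{S\,(M_i\bar{y}_i/\pi_i)}{S\,(M_i\bar{x}_i/\pi_i)} = \frac{\hat{Y}}{\hat{X}} \tag{6.31} \]

where $\hat{Y}$, $\hat{X}$ are unbiased estimates of Y and X, respectively. By using¹
the corollary to Theorem 5.1 (Sec. 5.3), we find that $|B(\hat{R})|/\sigma(\hat{R}) \le \mathrm{CV}(\hat{X})$.
Hence the bias in $\hat{R}$ will not be important if $\mathrm{CV}(\hat{X})$ is small. The mean
square error of $\hat{R}$ about R is $E(\hat{R} - R)^2 = E[(\hat{Y} - R\hat{X})/(X + \theta\,\delta X)]^2$
evaluated at $\theta = 1$. By setting $\bar{y} = \hat{Y}$ and $\bar{x} = \hat{X}$ in Theorem 5.3, a
first approximation to $\mathrm{MSE}(\hat{R})$ is given by

\[ \mathrm{MSE}(\hat{R}) \doteq \frac{1}{X^2}\,V(\hat{Y} - R\hat{X})
   = \frac{1}{X^2}\,V\left[S\,\frac{M_i(\bar{y}_i - R\bar{x}_i)}{\pi_i}\right] \]

Now setting $T_i = M_i(\bar{y}_i - R\bar{x}_i)$ in Theorem 6.3, the variance involved
can be calculated and we get

\[ X^2\,\mathrm{MSE}(\hat{R}) = \sum_i\sum_{j>i}(\pi_i\pi_j - \pi_{ij})
   \left(\frac{Y_i - RX_i}{\pi_i} - \frac{Y_j - RX_j}{\pi_j}\right)^2
   + \sum\frac{M_i^2}{\pi_i}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ui}^2 \tag{6.32} \]

where $S_{ui}^2$ is defined by (6.36). As an estimate of $\mathrm{MSE}(\hat{R})$ from the
sample, we may calculate, on using (6.17), the quantity

\[ \frac{1}{\hat{X}^2}\left[S_i\,S_{j>i}\,\frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}
   \left(\frac{M_i(\bar{y}_i - \hat{R}\bar{x}_i)}{\pi_i} - \frac{M_j(\bar{y}_j - \hat{R}\bar{x}_j)}{\pi_j}\right)^2
   + S\,\frac{M_i^2}{\pi_i}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ui}^2\right] \tag{6.33} \]

where $s_{ui}^2 = S_j\,[y_{ij} - \bar{y}_i - \hat{R}(x_{ij} - \bar{x}_i)]^2/(m_i - 1)$.

¹ It may be noted that Theorems 5.1 and 5.3 are quite general, as is shown in
Exercise 107.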


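Corollary

If the psu's are selected with equal probabilities without replacement, we
have $\pi_i\pi_j - \pi_{ij} = (n/N)(1 - n/N)[1/(N - 1)]$,

\[ \hat{R} = \frac{S\,M_i\bar{y}_i}{S\,M_i\bar{x}_i} \tag{6.34} \]

\[ \sum_i\sum_{j>i}[Y_i - RX_i - (Y_j - RX_j)]^2 = N\sum (Y_i - RX_i)^2 \]

Hence

\[ \mathrm{MSE}(\hat{R}) = \left(1 - \frac{n}{N}\right)\frac{1}{n\bar{X}^2}\,\frac{\sum (Y_i - RX_i)^2}{N - 1}
   + \frac{1}{nN\bar{X}^2}\sum\frac{M_i^2}{m_i}\left(1 - \frac{m_i}{M_i}\right)S_{ui}^2 \tag{6.35} \]

where

\[ S_{ui}^2 = \frac{1}{M_i - 1}\sum_j [y_{ij} - \bar{Y}_i - R(x_{ij} - \bar{X}_i)]^2 \tag{6.36} \]

And an estimate of $\mathrm{MSE}(\hat{R})$ from the sample is given by

\[ \left(1 - \frac{n}{N}\right)\frac{N^2}{n\hat{X}^2}\,\frac{S\,(M_i\bar{y}_i - \hat{R}M_i\bar{x}_i)^2}{n - 1}
   + \frac{N}{n\hat{X}^2}\,S\,\frac{M_i^2}{m_i}\left(1 - \frac{m_i}{M_i}\right)
   \frac{1}{m_i - 1}\,S_j\,[y_{ij} - \bar{y}_i - \hat{R}(x_{ij} - \bar{x}_i)]^2 \tag{6.37} \]

When $m_i/M_i$ is constant, the estimator $\hat{R}$ reduces to the
form $\hat{R} = S\,S\,y_{ij}/S\,S\,x_{ij}$.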


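Corollary

If psu's are selected at random to estimate the mean per subunit, the
estimator is

\[ \hat{R} = \frac{S\,M_i\bar{y}_i}{S\,M_i} \tag{6.38} \]

This is obtained from (6.34) by substituting $x_{ij} = 1$ in the equation.
Then $X_i = M_i$, $X = \sum M_i$, and R = population mean per subunit.
And a first approximation to $\mathrm{MSE}(\hat{R})$ is obtained as

\[ \mathrm{MSE}(\hat{R}) = \left(1 - \frac{n}{N}\right)\frac{1}{n\bar{X}^2}\,\frac{\sum (Y_i - RM_i)^2}{N - 1}
   + \frac{1}{nN\bar{X}^2}\sum\frac{M_i^2}{m_i}\left(1 - \frac{m_i}{M_i}\right)S_{wi}^2 \tag{6.39} \]

where $\bar{X} = \sum M_i/N$ and

\[ S_{wi}^2 = \frac{1}{M_i - 1}\sum_j (y_{ij} - \bar{Y}_i)^2 \tag{6.40} \]

As a sample estimate of the mean square error, we take

\[ \left(1 - \frac{n}{N}\right)\frac{N^2}{n\hat{X}^2}\,\frac{S\,M_i^2(\bar{y}_i - \hat{R})^2}{n - 1}
   + \frac{N}{n\hat{X}^2}\,S\,\frac{M_i^2}{m_i}\left(1 - \frac{m_i}{M_i}\right)
   \frac{1}{m_i - 1}\,S\,(y_{ij} - \bar{y}_i)^2 \tag{6.41} \]

Further reading See Exercise 49, in which an unbiased estimator of the
ratio is obtained from a different design.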


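6.8.2 SAMPLING WITH REPLACEMENT

If the n psu's are selected following scheme B of Sec. 6.5, the estimate of
R is

\[ \hat{R} = \frac{(1/n)\,S\,(M_i/p_i)\bar{y}_i}{(1/n)\,S\,(M_i/p_i)\bar{x}_i} = \frac{\hat{Y}}{\hat{X}} \tag{6.42} \]

The bias of $\hat{R}$ will be unimportant relative to its standard error if $\mathrm{CV}(\hat{X})$
is small. As shown in the previous section, a first approximation to
$\mathrm{MSE}(\hat{R})$ would be given by

\[ \mathrm{MSE}(\hat{R}) = \frac{1}{X^2}E(\hat{Y} - R\hat{X})^2
   = \frac{1}{X^2}\,V\left[\frac{1}{n}\,S\,\frac{M_i}{p_i}(\bar{y}_i - R\bar{x}_i)\right] \]

Now, setting $T_i = M_i(\bar{y}_i - R\bar{x}_i)$ in Theorem 6.4, we have

\[ X^2\,\mathrm{MSE}(\hat{R}) = \frac{1}{n}\sum\frac{(Y_i - RX_i)^2}{p_i}
   + \frac{1}{n}\sum\frac{M_i^2}{p_i}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ui}^2 \tag{6.43} \]

where $S_{ui}^2$ is defined by (6.36). By (6.21) of Theorem 6.4, an estimator
of $\mathrm{MSE}(\hat{R})$ is

\[ \frac{1}{\hat{X}^2}\,\frac{1}{n(n - 1)}\,S\left(\frac{M_i\bar{y}_i}{p_i} - \hat{R}\,\frac{M_i\bar{x}_i}{p_i}\right)^2 \tag{6.44} \]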
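The ratio estimate (6.42) and the estimate (6.44) of its mean square error can be computed from the quantities $M_i\bar{y}_i/p_i$ and $M_i\bar{x}_i/p_i$ alone. The sketch below assumes the sample is held as `(M_i, y-observations, x-observations)` triples; the name `ratio_wr` and all figures are hypothetical.

```python
from statistics import mean


def ratio_wr(sample, p):
    """Return (R_hat, mse_hat) from (6.42) and (6.44) under scheme B.

    sample : list of (M_i, [y_ij], [x_ij]) for the n selections
    p      : selection probabilities p_i in the same order
    """
    n = len(sample)
    zy = [m_i * mean(ys) / p_i for (m_i, ys, _), p_i in zip(sample, p)]
    zx = [m_i * mean(xs) / p_i for (m_i, _, xs), p_i in zip(sample, p)]
    r_hat = sum(zy) / sum(zx)
    x_hat = sum(zx) / n
    # The residuals zy - R_hat * zx sum to zero, so no mean needs subtracting.
    mse_hat = sum((a - r_hat * b) ** 2 for a, b in zip(zy, zx)) / (
        n * (n - 1) * x_hat**2
    )
    return r_hat, mse_hat


# Invented sample of n = 3 selections with two subunits observed in each.
sample = [
    (20, [3.0, 5.0], [1.0, 3.0]),
    (30, [4.0, 6.0], [2.0, 2.0]),
    (10, [2.0, 2.0], [1.0, 1.0]),
]
p = [0.2, 0.3, 0.1]
r_hat, mse_hat = ratio_wr(sample, p)
print(round(r_hat, 4), round(mse_hat, 4))
```

Setting every x-observation equal to 1 turns the same function into the estimate of the mean per subunit.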


Corollary

If the object is to estimate the mean per subunit, an unbiased estimate of
the total number of subunits is provided by $(1/n)\,S\,M_i/p_i$, and hence the
mean per subunit is estimated by

\[ \hat{R} = \frac{(1/n)\,S\,(M_i/p_i)\bar{y}_i}{(1/n)\,S\,(M_i/p_i)} \tag{6.45} \]

This can be obtained from (6.42) by setting $x_{ij} = 1$, which gives
$X_i = M_i$, $X = \sum M_i$, and R = population mean per subunit.
Thus a first approximation to the mean square error from (6.43) is

\[ \mathrm{MSE}(\hat{R}) = \frac{1}{n(\sum M_i)^2}\left[\sum\frac{1}{p_i}(Y_i - RM_i)^2
   + \sum\frac{M_i^2}{p_im_i}\left(1 - \frac{m_i}{M_i}\right)S_{wi}^2\right] \tag{6.46} \]

where $S_{wi}^2$ is given by (6.40). And an estimate of $\mathrm{MSE}(\hat{R})$ is provided by

\[ \frac{1}{n(n - 1)\hat{X}^2}\,S\left[\frac{M_i}{p_i}(\bar{y}_i - \hat{R})\right]^2 \tag{6.47} \]

where $\hat{X} = (1/n)\,S\,M_i/p_i$.

population estimIt eStlmate 18 made for each strut 

"ill be obtainX 1 “7 p ™»pC ‘v 0 Ch buUd U P **>e final 

lf -®^ -es :t additi011 ° V » ‘hesrate ^ 

“ d the strata rati™ * “ e lar S* ®ough for th k 8 be ^equate 

oombmed-ratio estimate 7ill C h nSlderabIy from eachoth^ *° “^ble 

through for 8 eh “t ewil) be preferred T t, ° ther - Otherwise, 




fi y Theorem 5.1 the hi % 1* 

error when CV( 2 .£ \ • as of will be nee-li 'ki 

error of ft /u ls small a n A neg ^Sible relative + • 

‘ 'f ■“> * 1 

(sK-M,.-(iA P rv 

This ^ows that th V * 2 ' 4 (? ‘ - *^)1 

rwa* - y ■ v T* »» **tr “* «*-» 
kH 5 ^ r 

• V p», n + «x k ) 

wb «e + I^yi^ 7 . 


“■'ssr^K-r..,^ 


. hbu 1 

‘ifu, 


7,)] ! 


(6.50) 


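Further, the mean square error of $\hat R$ is estimated by

$$\frac{1}{X^2}\sum_h\frac{1}{n_h(n_h - 1)}\sum_{i=1}^{n_h}\left[\frac{M_{hi}(\bar y_{hi} - \hat R\bar x_{hi})}{p_{hi}} - \frac{1}{n_h}\sum_{i=1}^{n_h}\frac{M_{hi}(\bar y_{hi} - \hat R\bar x_{hi})}{p_{hi}}\right]^2 \tag{6.51}$$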


Note The reader is advised to work out Exercise 45 in which wtr simple 
random sampling is used at both stages. 


6.9 CHOICE OF SAMPLING AND SUBSAMPLING FRACTIONS 

The question of the number of psu's to select in the sample and the
number of subunits to subsample will now be discussed for the two-stage
scheme of Sec. 6.5, in which n psu's are selected with replacement with
given probabilities $p_i$ and a simple random sample of $m_i$ subunits is taken
from the $M_i$ if psu $U_i$ is in the sample. By Theorem 6.4, we have

$$V(\hat Y) = \frac{1}{n}\left(V_p - \sum\frac{M_i S_{wi}^2}{p_i}\right) + \frac{1}{n}\sum\frac{M_i^2 S_{wi}^2}{p_i m_i}$$

where $V_p = \sum p_i (Y_i/p_i - Y)^2$.

Let $c_1$ be the fixed cost per psu on travel, setting up an office, etc.; $c_2$ be
the cost of listing a second-stage unit in a selected psu; and $c_3$ be the cost
per second-stage unit of collecting information. Then the cost of the
survey would be $nc_1 + c_2\sum M_i + c_3\sum m_i$, its expected value being

$$C = nc_1 + nc_2\sum p_i M_i + nc_3\sum p_i m_i \tag{6.52}$$


In order to minimize the variance for a fixed expected budget we construct the function

$$V(\hat Y) + \lambda\left(nc_1 + nc_2\sum p_i M_i + nc_3\sum p_i m_i - C\right)$$

The equations based on differentiation with respect to $m_i$ ($i = 1, 2, \ldots, N$) immediately give

$$m_i \propto \frac{M_i S_{wi}}{p_i} \qquad\text{or}\qquad m_i = \frac{aM_i S_{wi}}{p_i}$$

Since the $p_i$ are known, the only quantities to be determined are n and
a. Differentiating with respect to n and a, respectively, and
equating to zero, we have

$$a^2 = \frac{c_1 + c_2\sum p_i M_i}{c_3\left(V_p - \sum M_i S_{wi}^2/p_i\right)}$$

This gives the value of a, from which we get

$$m_i = \left(\frac{c_1 + c_2\sum p_i M_i}{c_3}\right)^{1/2}\frac{M_i S_{wi}}{p_i}\left(V_p - \sum\frac{M_i S_{wi}^2}{p_i}\right)^{-1/2} \tag{6.53}$$

From the cost function (6.52), the value of n can be obtained.

Equation (6.53) leads to a very useful result. Provided the $S_{wi}$ are not very
different, the optimum choice of $m_i$ consists in making it proportional to
$M_i/p_i$, in which case the estimator $\hat Y$ becomes self-weighting. Thus the
self-weighting system is not only very convenient for making computations
but is also close to the optimum when the variation within psu's is about the same.
In case $S_{wi}$ depends directly on $M_i$, stratification by size ($M_i$) would help
in making $S_{wi}$ nearly constant within strata.

Another advantage of the self-weighting estimator is that it
may involve a constant sample size from each sample psu (as happens when $p_i$ is proportional to $M_i$). Thus it is
possible to have equal work loads in the
different psu's.
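As a numerical illustration of (6.53), the following Python sketch computes the constant a, the subsample sizes $m_i$, and the number of psu's n that exhausts a given expected budget. The function name `optimum_two_stage` and all input values are hypothetical; the between-psu quantity $V_p$ and the within-psu standard deviations $S_{wi}$ are assumed to be known or guessed in advance, and the $m_i$ are left unrounded.

```python
import math

def optimum_two_stage(M, p, S_w, V_p, c1, c2, c3, budget):
    """Sketch of (6.53): optimum m_i, and n from the expected cost (6.52).

    M, p, S_w : psu sizes, selection probabilities, within-psu std deviations
    V_p       : sum of p_i (Y_i/p_i - Y)^2, assumed known in advance
    """
    # Between-psu part of the variance, V_p - sum(M_i S_wi^2 / p_i)
    between = V_p - sum(Mi * Si ** 2 / pi for Mi, Si, pi in zip(M, S_w, p))
    if between <= 0:
        raise ValueError("V_p - sum(M_i S_wi^2 / p_i) must be positive")
    # Expected fixed and listing cost per selected psu
    fixed = c1 + c2 * sum(pi * Mi for pi, Mi in zip(p, M))
    a = math.sqrt(fixed / (c3 * between))
    m = [a * Mi * Si / pi for Mi, Si, pi in zip(M, S_w, p)]
    # Expected cost per psu, then n from C = n * (cost per psu)
    cost_per_psu = fixed + c3 * sum(pi * mi for pi, mi in zip(p, m))
    n = budget / cost_per_psu
    return a, m, n

# Hypothetical stratum: p_i proportional to M_i and equal S_wi
a, m, n = optimum_two_stage(
    M=[100, 200, 300, 400], p=[0.1, 0.2, 0.3, 0.4], S_w=[2, 2, 2, 2],
    V_p=416000, c1=100, c2=0.5, c3=1, budget=30000)
print(a, m, n)
```

With $p_i$ proportional to $M_i$ and equal $S_{wi}$, the $m_i$ come out equal, which is the constant-work-load situation described above.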


Further reading When sampling is simple random at both stages 
Exercise 48 suggests that it is better to make the total sample size a 
random variable. 


6.9.1 CHOICE OF OPTIMUM PROBABILITIES 

In the previous section the probabilities $p_i$ were supposed to be given.
We shall now discuss the question of the determination of the best values
of $p_i$ as well as the sampling and subsampling fractions. The analysis
will be presented for the sampling scheme of Sec. 6.9 when a self-weighting
estimator is used for estimating the population ratio R (Hansen and
Hurwitz, 1949). By Sec. 6.8.2,

$$\hat R = \frac{(1/n)\sum\dfrac{M_i}{p_i m_i}\sum_j y_{ij}}{(1/n)\sum\dfrac{M_i}{p_i m_i}\sum_j x_{ij}}$$

This will become self-weighting if $m_i = kM_i/p_i$. The expected cost will
be given by $C = nc_1 + nc_2\sum p_i M_i + nkc_3\sum M_i$. From (6.43),

$$X^2 V(\hat R) = \frac{1}{n}\sum\frac{U_i^2 - M_i S_{ui}^2}{p_i} + \frac{1}{nk}\sum M_i S_{ui}^2$$

where $u_{ij} = y_{ij} - Rx_{ij}$, $U_i = Y_i - RX_i$, and $S_{ui}^2$ is defined by (6.36).
Letting $np_i = p_i'$ and $nk = k'$, we have

$$C = nc_1 + c_2\sum p_i' M_i + k'c_3\sum M_i$$

Thus the problem is to find n, $k'$, and $p_i'$ such that $X^2V(\hat R)$ is a minimum
for a given value of the expected budget and subject to the condition that
$\sum p_i' - n = 0$ (or $\sum p_i = 1$). For this purpose we construct

$$G = X^2V(\hat R) + \lambda\left(nc_1 + c_2\sum p_i' M_i + c_3 k'\sum M_i - C\right) + \mu\left(\sum p_i' - n\right)$$

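Equating $\partial G/\partial n$ and $\partial G/\partial p_i'$ to zero, we have

$$\lambda c_1 - \mu = 0$$

$$-\frac{M_i^2 D_{ui}^2}{p_i'^2} + \lambda c_2 M_i + \mu = 0 \qquad D_{ui}^2 = \bar U_i^2 - \frac{S_{ui}^2}{M_i}$$

where $\bar U_i = U_i/M_i$. From these equations we get

$$\lambda p_i'^2 = \frac{M_i^2 D_{ui}^2}{c_1 + c_2 M_i}$$

or

$$p_i = \frac{M_i D_{ui}/\sqrt{c_1 + c_2 M_i}}{\sum M_i D_{ui}/\sqrt{c_1 + c_2 M_i}} \tag{6.54}$$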


Now

$$\sum D_{ui}^2 = \sum(\bar Y_i - R\bar X_i)^2 - \sum\frac{S_{ui}^2}{M_i}$$

If the psu sizes $M_i$ are all equal to M, we find from (6.6) that $\sum D_{ui}^2$ is
proportional to the intra-psu correlation coefficient $\rho_M$ applied to the
variate $y_{ij} - Rx_{ij}$. We are going to assume that $\rho_M$ is positive. It is
generally found that $\rho_M$ tends to decrease with increasing M, since an
increase in the size of the cluster brings about greater heterogeneity.
Substituting for $D_{ui}^2$ its average value, we have from (6.54)

$$p_i \propto \frac{M_i\sqrt{E(D_{ui}^2)}}{(c_1 + c_2 M_i)^{1/2}}$$

If $E(D_{ui}^2)$ is assumed to be constant, we find that $p_i \propto M_i$ if $c_2 M_i$, the
cost of listing the psu, is negligibly small as compared with $c_1$, the fixed
cost per psu. If, however, $c_1$ is negligibly small as compared with $c_2 M_i$, $p_i$
would be proportional to the square root of $M_i$. If we assume that
$E(D_{ui}^2) \propto 1/M_i$, the corresponding results in the two situations are
$p_i \propto \sqrt{M_i}$ and $p_i$ = a constant. Having determined the $p_i$, it is a
simple matter to put down expressions for the optimum values of n and $k'$.
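The limiting cases discussed above can be checked numerically. The Python sketch below evaluates (6.54) for given psu sizes $M_i$, values $D_{ui}$, and unit costs; the function name `optimum_probabilities` and the inputs are hypothetical, and the $D_{ui}$ are assumed to be available from past data or a pilot study.

```python
import math

def optimum_probabilities(M, D, c1, c2):
    """Sketch of (6.54): p_i proportional to M_i D_ui / sqrt(c1 + c2 M_i)."""
    weights = [Mi * Di / math.sqrt(c1 + c2 * Mi) for Mi, Di in zip(M, D)]
    total = sum(weights)
    return [w / total for w in weights]

M = [1, 4, 9]
D = [1.0, 1.0, 1.0]          # constant D_ui (assumed for illustration)

# Listing cost negligible (c2 = 0): p_i proportional to M_i
print(optimum_probabilities(M, D, c1=1.0, c2=0.0))

# Fixed cost negligible (c1 = 0): p_i proportional to sqrt(M_i)
print(optimum_probabilities(M, D, c1=0.0, c2=1.0))
```

The two calls reproduce the two situations in the text: selection with probability proportionate to size when listing is cheap, and proportionate to the square root of size when the fixed cost per psu is negligible.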


6.10 SOME USEFUL MULTISTAGE DESIGNS 

A number of multistage designs which have become popular or are likely
to be increasingly used will now be described. In a large number of national
surveys the population is stratified thoroughly, to the point that only a
few psu's are selected from each stratum. Each sample psu is subsampled
in a manner depending on the type of material available (like
maps, air photographs, lists, etc.) for subdividing it. Thus the discussion
will be confined to the selection of psu's from within strata. The
quantity $T_i$ will denote an unbiased estimator of the ith psu total
based on subsampling.

6.10.1 RANDOMIZED SYSTEMATIC SAMPLING OF PSU'S

In this procedure the psu's in the stratum are arranged at random and their
measures of size are cumulated. If n psu's are to be selected from a stratum with
total measure of size X, a random number i is taken between 1 and $k = X/n$.
The psu's in the sample are those in whose range lie the random number
i and the numbers $i + k, i + 2k, \ldots$, obtained by adding k successively
to i. It is obvious that with this procedure $\pi_i = nX_i/X$, which
is exactly proportional to $X_i$. The estimator of the population total, its
variance, and an estimate of variance can be obtained with the help of
Theorem 6.3. One point to be mentioned here is the calculation of $\pi_{ij}$
in order to make an unbiased estimate of the variance. If the number
of psu's in the stratum is small, $\pi_{ij}$ for the two units in the
sample can be calculated by listing all possible arrangements of the psu's
and counting those arrangements in which both units are selected. When the
number of psu's is large, an approximate expression for $\pi_{ij}$ is available
(Hartley and Rao, 1962).


6.10.2 ONE PSU PER STRATUM

The stratification may be carried to the point that just one psu per stratum
is used in the sample. The selection will be made with probability proportionate
to a measure of size. The estimate and its variance will be given by
(6.19) and (6.20) with n = 1. A conservative estimate of the variance
can be made by grouping the strata in pairs and calculating

$$A_j = \left(\frac{T_{j1}}{p_{j1}} - \frac{T_{j2}}{p_{j2}}\right)^2$$

from the jth pair. The expected value of $A_j$ would be

$$E(A_j) = V\left(\frac{T_{j1}}{p_{j1}}\right) + V\left(\frac{T_{j2}}{p_{j2}}\right) + (Y_{j1} - Y_{j2})^2$$

so that $\sum A_j$ will overstate the true variance $\sum V(T_i/p_i)$. The overstatement
could be controlled by pairing those strata (in advance of the survey)
which have approximately the same total for the character under study.
This design is being used by the United States Bureau of the Census for
collecting data on employment and unemployment.

6.10.3 ONE PSU PER RANDOMIZED SUBSTRATUM

In this method the stratum is divided up into n substrata by allocation of
the psu's at random to them. (As will be seen, the number of psu's per
substratum should be about the same.) One psu is selected with pps
from each substratum. This gives a sample of n psu's selected without
replacement with pps. This procedure is due to Rao, Hartley, and
Cochran (1962). Suppose the stratum is divided up into n substrata
containing $N_i$ ($i = 1, \ldots, n$) psu's. Let $X_i$ be the measure of size of
the ith substratum ($X_i$ is a random variable). Then the estimator of the
stratum total Y in single-stage sampling is

$$\hat Y = \sum_{i=1}^{n} X_i\,\frac{y_{ij}}{x_{ij}}$$

For a given subdivision of the stratum into substrata, we have

$$E_2\left(X_i\frac{y_{ij}}{x_{ij}}\right) = \sum_j\frac{x_{ij}}{X_i}\,X_i\frac{y_{ij}}{x_{ij}} = Y_i$$

so that $E(\hat Y) = \sum Y_i = Y$. Also $V_1E_2(\hat Y) = 0$. And

$$V_2(\hat Y) = \sum_i V_2\left(X_i\frac{y_{ij}}{x_{ij}}\right) = \sum_i\sum_{j<k}\frac{x_{ij}x_{ik}}{X_i^2}\left(\frac{y_{ij}}{x_{ij}/X_i} - \frac{y_{ik}}{x_{ik}/X_i}\right)^2 = \sum_i\sum_{j<k}x_{ij}x_{ik}\left(\frac{y_{ij}}{x_{ij}} - \frac{y_{ik}}{x_{ik}}\right)^2$$

by (3.23).


Now the probability that two specified units belong to the ith substratum
is $(N_i/N)(N_i - 1)/(N - 1)$. Hence

$$E_1V_2(\hat Y) = \sum_i\frac{N_i(N_i - 1)}{N(N - 1)}\sum_{j<k}^{N}x_j x_k\left(\frac{y_j}{x_j} - \frac{y_k}{x_k}\right)^2$$

or

$$V(\hat Y) = \frac{\sum N_i^2 - N}{N(N - 1)}\sum_{i=1}^{N}p_i\left(\frac{y_i}{p_i} - Y\right)^2$$

where $p_i = x_i/X$. This shows that for best results the $N_i$ should be equal. In that case

$$V(\hat Y) = \left(1 - \frac{n - 1}{N - 1}\right)\frac{1}{n}\sum p_i\left(\frac{y_i}{p_i} - Y\right)^2 = \left(1 - \frac{n - 1}{N - 1}\right)V_{\mathrm{pps}}$$

where $V_{\mathrm{pps}}$ stands for the variance when sampling is with pps with
replacement.
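The single-stage procedure and its variance formula can be sketched in Python as follows. This is an illustrative sketch only: the function names `rhc_estimate` and `rhc_variance` are hypothetical, the random grouping is done by shuffling the psu labels and dealing them into n groups of nearly equal size, and the data are invented.

```python
import random

def rhc_estimate(y, x, n, rng):
    """One draw of the estimator sum_i X_i * y_ij / x_ij (single-stage case)."""
    units = list(range(len(y)))
    rng.shuffle(units)                          # random allocation to substrata
    groups = [units[g::n] for g in range(n)]    # group sizes differ by at most 1
    estimate = 0.0
    for group in groups:
        X_i = sum(x[j] for j in group)          # size measure of the substratum
        # one psu per substratum, with probability proportionate to size
        j = rng.choices(group, weights=[x[j] for j in group], k=1)[0]
        estimate += X_i * y[j] / x[j]
    return estimate

def rhc_variance(y, x, group_sizes):
    """V = (sum N_i^2 - N) / (N (N - 1)) * sum p_i (y_i/p_i - Y)^2, p_i = x_i/X."""
    N, X, Y = len(y), sum(x), sum(y)
    v_pps_one = sum((xi / X) * (yi * X / xi - Y) ** 2 for yi, xi in zip(y, x))
    factor = (sum(s * s for s in group_sizes) - N) / (N * (N - 1))
    return factor * v_pps_one

x = [2, 5, 1, 8, 4, 6]
y = [3 * v for v in x]                          # y exactly proportional to x
print(rhc_estimate(y, x, n=2, rng=random.Random(0)))
print(rhc_variance(y, x, group_sizes=[3, 3]))
```

When y is exactly proportional to x every possible sample gives the true total and the variance formula returns zero, which is a convenient check on both functions.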

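In the multistage case let $T_{ij}$ be an unbiased estimator of the jth
psu total in the ith substratum with $V(T_{ij}) = \sigma_{ij}^2$. Then an
estimator of the population total will be (Raj, 1966a)

$$\hat Y = \sum_{i=1}^{n}X_i\,\frac{T_{ij}}{x_{ij}}$$

and its variance will be given by

$$V(\hat Y) = \frac{\sum N_i^2 - N}{N(N - 1)}\sum_{i=1}^{N}p_i\left(\frac{Y_i}{p_i} - Y\right)^2 + E_1\sum_i X_i\sum_j\frac{\sigma_{ij}^2}{x_{ij}}$$

Now in random samples of size $N_i$ from a population of size N the
expected value of $X_i\sum(y_i/x_i)$ is, from first principles, given by
$[1/N(N - 1)][N_i(N - N_i)\sum y_i + N_i(N_i - 1)X\sum(y_i/x_i)]$.
Hence

$$V(\hat Y) = \frac{\sum N_i^2 - N}{N(N - 1)}\sum p_i\left(\frac{Y_i}{p_i} - Y\right)^2 + \frac{N^2 - \sum N_i^2}{N(N - 1)}\sum\sigma_i^2 + \frac{\sum N_i^2 - N}{N(N - 1)}\sum\frac{\sigma_i^2}{p_i}$$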

Further reading For a method of estimating the variance, refer to
Exercise 16(a) and Exercise 54. A general method of estimating the
variance is presented in Sec. 9.9.


6.10.4 PSU'S SELECTED WITH PPS OF REMAINDER 

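In this method the first psu in the sample is selected with pps and the
second with pps of the remaining psu's, and so on (Sec. 3.24). Considering
the practical situation of n = 2 psu's per stratum, the following
are unbiased estimators of the stratum total (Raj, 1956, 1966b):

$$Z_1 = \frac{T_1}{p_1} \qquad Z_2 = T_1 + \frac{T_2}{p_2}(1 - p_1) \qquad Z = \tfrac{1}{2}(Z_1 + Z_2)$$

To find the variance of $Z_1$, we have

$$V_1E_2(Z_1) = V\left(\frac{Y_1}{p_1}\right) = V(t_1) \qquad E_1V_2(Z_1) = E\left(\frac{\sigma_1^2}{p_1^2}\right) = \sum\frac{\sigma_i^2}{p_i}$$

where the $t_i$ are defined in Sec. 3.24. Hence

$$V(Z_1) = V(t_1) + \sum\frac{\sigma_i^2}{p_i}$$

Similarly,

$$V(Z_2) = V(t_2) + E\left[\sigma_1^2 + \frac{\sigma_2^2}{p_2^2}(1 - p_1)^2\right]$$

But

$$E\left[\sigma_1^2 + \frac{\sigma_2^2}{p_2^2}(1 - p_1)^2\right] = E\left[\sigma_1^2 + (1 - p_1)\sum_{j\ne 1}\frac{\sigma_j^2}{p_j}\right]$$

$$= p_1\left[\sigma_1^2 + (1 - p_1)\sum_{j\ne 1}\frac{\sigma_j^2}{p_j}\right] + p_2\left[\sigma_2^2 + (1 - p_2)\sum_{j\ne 2}\frac{\sigma_j^2}{p_j}\right] + \cdots$$

$$= \sum\left(2p_i - 1 + \frac{1 - \sum p_j^2}{p_i}\right)\sigma_i^2$$

Hence

$$V(Z_2) = V(t_2) + \sum\left(2p_i - 1 + \frac{1 - \sum p_j^2}{p_i}\right)\sigma_i^2$$

Also

$$\mathrm{Cov}(Z_1, Z_2) = \sum\sigma_i^2$$

Hence

$$V(Z) = \tfrac{1}{4}\left[V(Z_1) + V(Z_2) + 2\sum\sigma_i^2\right] = \frac{1}{2}\sum_i\sum_{j>i}\left(1 - \frac{p_i + p_j}{2}\right)p_i p_j\left(\frac{Y_i}{p_i} - \frac{Y_j}{p_j}\right)^2 + \frac{1}{4}\sum\left(1 + 2p_i + \frac{2 - \sum p_j^2}{p_i}\right)\sigma_i^2$$

Note the correction factor $1 - (p_i + p_j)/2$, which brings down the
between-psu contribution to the variance. An unbiased estimator of V(Z) is
found to be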


4 \Pl P2 ) 1 + -** 


±Si + L^p. . 

T* P ‘ P= 

It may be noted that the estimator Z becomes self-weighting if

$$\frac{M_1}{m_1} = \frac{2p_1}{k(1 + p_1)} \qquad \frac{M_2}{m_2} = \frac{2p_2}{k(1 - p_1)}$$

when $m_i$ second-stage units are selected at random from the $M_i$ second-stage
units in the ith psu in a two-stage sample.

Remark It can be proved (Exercise 51) that $V(Z_2)$ is smaller than
$V(Z_1)$.
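For a small stratum, the unbiasedness of Z and the between-psu part of its variance can be verified by enumerating every ordered pair of psu's. The Python sketch below does this for the single-stage case, in which each $T_i$ equals the psu total $Y_i$ and all $\sigma_i^2$ are zero; the function names and the data are hypothetical.

```python
from itertools import permutations

def ordered_estimator_moments(Y, p):
    """Exact mean and variance of Z = (Z1 + Z2)/2 for n = 2, with T_i = Y_i."""
    mean = 0.0
    second_moment = 0.0
    for i, j in permutations(range(len(Y)), 2):
        prob = p[i] * p[j] / (1 - p[i])      # first draw pps, second pps of remainder
        z1 = Y[i] / p[i]
        z2 = Y[i] + Y[j] * (1 - p[i]) / p[j]
        z = 0.5 * (z1 + z2)
        mean += prob * z
        second_moment += prob * z * z
    return mean, second_moment - mean ** 2

def between_psu_variance(Y, p):
    """(1/2) sum_{i<j} (1 - (p_i + p_j)/2) p_i p_j (Y_i/p_i - Y_j/p_j)^2."""
    total = 0.0
    for i in range(len(Y)):
        for j in range(i + 1, len(Y)):
            total += ((1 - 0.5 * (p[i] + p[j])) * p[i] * p[j]
                      * (Y[i] / p[i] - Y[j] / p[j]) ** 2)
    return 0.5 * total

Y = [3.0, 5.0, 10.0, 2.0]
p = [0.1, 0.2, 0.3, 0.4]
mean, variance = ordered_estimator_moments(Y, p)
print(mean, sum(Y))                      # the mean of Z equals the stratum total
print(variance, between_psu_variance(Y, p))
```

The enumerated variance agrees with the first term of V(Z), so the correction factor $1 - (p_i + p_j)/2$ can be seen at work on actual numbers.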


Further reading

1. See Exercise 50 for an alternative estimator.

2. For an alternative but equivalent sampling scheme, refer to Exercise
44 and Exercises 18 and 19.

3. For another sampling scheme see Exercise 52, where the sample of
psu's is selected with probability proportionate to its aggregate measure
of size.


REFERENCES

Hansen, M. H., and W. N. Hurwitz (1949). On the determination of the optimum
probabilities in sampling. Ann. Math. Stat., 20.

Hansen, M. H., W. N. Hurwitz, and W. G. Madow (1953). "Sample Survey Methods
and Theory," John Wiley & Sons, Inc., New York.

Hartley, H. O., and J. N. K. Rao (1962). Sampling with unequal probabilities and
without replacement. Ann. Math. Stat., 33.

Mahalanobis, P. C. (1940). A sample survey of the acreage under jute in
Bengal. Sankhya, 4.

Raj, D. (1954). On sampling with varying probabilities in multistage designs.

Raj, D. (1956). Some estimators in sampling with varying probabilities without
replacement.

Raj, D. (1964). The use of systematic sampling with probability proportionate
to size in a large scale survey. J. Am. Stat. Assoc., 59.

Raj, D. (1966a). Some remarks on a simple procedure of sampling without
replacement. J. Am. Stat. Assoc., 61.

Raj, D. (1966b). On a method of sampling with unequal probabilities. Ganita,
17.

Rao, J. N. K., H. O. Hartley, and W. G. Cochran (1962). A simple procedure
of unequal probability sampling without replacement. J. Roy. Stat. Soc., B 24.




CHAPTER SEVEN


DOUBLE-SAMPLING PROCEDURES AND REPETITIVE SURVEYS


7.1 INTRODUCTION 

Numerous examples have been given in the previous chapters to show
how the available auxiliary information could be used to achieve greater
precision. However, if auxiliary information is not available but can
be collected rather inexpensively on a somewhat large scale, it may pay
to collect such information in the first instance and then take a sample for
the measurement of y, the character under study. As an example, suppose
it is considered desirable to select a sample of agricultural holdings
with probability proportionate to area, but information on area is not
available. We may then decide to take an initial random sample of
holdings and collect information on their areas (say, by asking the holders)
and then take a subsample of holdings with probability proportionate to
area and collect information on the characters under study from this
second sample. No doubt, within the allowable budget, the size of the
second sample will be reduced from that originally planned, but the use
of areas in selecting the second sample may more than compensate for
the reduction in sample size. There are several ways of using the initial
sample. It may be employed for introducing a desirable stratification or
for making a good estimate of X for purposes of ratio, regression, or
difference estimation. The main point of departure from previous theory
is that the sample is now taken in two phases: first an initial sample and
then a second sample. That is why this procedure is called double
sampling or two-phase sampling. Several examples of the application
of this technique in sampling work will now be presented.


7.2 DOUBLE SAMPLING FOR DIFFERENCE ESTIMATION 

In order to estimate the population mean for y, it may be considered
important to use the method of differences although information on x
is not available. Then an initial random sample of size $n'$ is selected without
replacement and information on x is collected. The second sample
is a subsample of size n taken without replacement from the first sample
and y is measured on it. Let k denote a good guess of the ratio of y to x
in the population. The following estimator is used for estimating
$\bar Y$ (Raj, 1965a):

$$\hat\mu = \bar y - k\bar x + k\bar x' \tag{7.1}$$

where $\bar y$ and $\bar x$ are the sample means for the subsample and $\bar x'$ is the mean of
x in the initial sample. Given the initial sample,

$$E_2(\bar y - k\bar x + k\bar x') = \bar y'$$

so that the estimator used is unbiased for $\bar Y$. For the
variance of $\hat\mu$, we have

$$V_1E_2(\hat\mu) = V(\bar y') = \left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2 \tag{7.2}$$

$$V_2(\hat\mu) = \left(\frac{1}{n} - \frac{1}{n'}\right)\sum^{n'}\frac{(y_i - kx_i - \bar y' + k\bar x')^2}{n' - 1}$$

$$E_1V_2(\hat\mu) = \left(\frac{1}{n} - \frac{1}{n'}\right)\sum^{N}\frac{(y_i - kx_i - \bar Y + k\bar X)^2}{N - 1} = \left(\frac{1}{n} - \frac{1}{n'}\right)(S_y^2 + k^2S_x^2 - 2k\rho S_xS_y) \tag{7.3}$$

Hence

$$V(\hat\mu) = \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 - \left(\frac{1}{n} - \frac{1}{n'}\right)kS_x(2\rho S_y - kS_x) \tag{7.4}$$

If $c'$ and c denote the unit costs of collecting information on x and
y, respectively ($c'$ will usually be much smaller than c), the total
cost of the double-sampling procedure would be

$$C = c'n' + cn \tag{7.5}$$

If a straight random sample is taken (without the double-sampling
procedure) for y, the sample size for the same cost will be

$$n_0 = \frac{cn + c'n'}{c} = n + \frac{c'n'}{c}$$

and the variance of the sample mean will be $(1/n_0 - 1/N)S_y^2$. Denoting $kS_x/S_y$ by h, the double-sampling procedure will be more precise for the same cost if

$$2\rho - h > \frac{1}{h\left(1 - \dfrac{n}{n'}\right)\left(1 + \dfrac{cn}{c'n'}\right)} \tag{7.6}$$


' ' / \ ^ / J 

As an example, let k be the regression coefficient $\rho S_y/S_x$, so that $h = \rho$. Further

relation h A^° ^ T^k Z t* Then the condition is that the cor 
relation between y and x be higher than 0.47. 


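Remark By finding unbiased estimates of the expressions in (7.2) and
(7.3) we can easily find an unbiased estimator of the variance [given by
Eq. (7.4)] to be

$$\left(\frac{1}{n} - \frac{1}{n'}\right)s_d^2 + \left(\frac{1}{n'} - \frac{1}{N}\right)s_y^2 \tag{7.7}$$

where

$$s_y^2 = \frac{1}{n - 1}\sum^{n}(y_i - \bar y)^2 \qquad s_d^2 = \frac{1}{n - 1}\sum^{n}\,[y_i - \bar y - k(x_i - \bar x)]^2$$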

Further reading Say that information is collected on not just one but 
several x-variates in the initial sample. Then Exercise 58 shows how 
this information may be used for achieving higher precision. 


7.2.1 INDEPENDENT SAMPLES

The case in which the second sample is taken independently of the initial
large sample will now be considered. This is done when, for instance,
information on x is available with one agency and information on both
y and x on a small independent sample has been collected by another
agency. It is possible to make use of the information collected by both
agencies for improving the estimate of the mean of y. The estimator
given in (7.1) will again be unbiased. This is so because

$$E(\bar y - k\bar x) = \bar Y - k\bar X \qquad\text{and}\qquad E(k\bar x') = k\bar X$$

Since the two samples are taken independently, the variance of $\hat\mu$ is given
by the sum of the variances of $\bar y - k\bar x$ and $k\bar x'$. Thus

$$V(\hat\mu) = \left(\frac{1}{n} - \frac{1}{N}\right)(S_y^2 + k^2S_x^2 - 2k\rho S_xS_y) + k^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_x^2 \tag{7.8}$$

By the same argument, an unbiased estimator of the variance would be

$$\left(\frac{1}{n} - \frac{1}{N}\right)s_d^2 + k^2\left(\frac{1}{n'} - \frac{1}{N}\right)s_x^2$$

where $s_d^2$ is as in (7.7) and $s_x^2 = \sum^{n'}(x_i - \bar x')^2/(n' - 1)$.


7.3 DOUBLE SAMPLING FOR PPS ESTIMATION 

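Consider the situation in which it is considered desirable to select the
sample with pp to x but information on x is not
available. This information is then collected from an initial sample
(simple random) of size $n'$ from which a subsample of size n is selected
with replacement with pp to x. Then we prove the following theorem
(Raj, 1964).

Theorem 7.1

When the initial sample is simple random and the second sample is a subsample
selected with replacement with pp to x, an unbiased estimator of Y is

$$\hat Y = \frac{N}{n'}\,\frac{x'}{n}\sum^{n}\frac{y_i}{x_i} \qquad x' = \sum^{n'}x_i \tag{7.9}$$

with

$$V(\hat Y) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2 + \frac{N(n' - 1)}{nn'(N - 1)}\,V_p(y) \tag{7.10}$$

where $V_p(y) = \sum^{N}(x_i/X)(Xy_i/x_i - Y)^2$, and an unbiased estimator of $V(\hat Y)$ is

$$v(\hat Y) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)\frac{1}{n' - 1}\left[\frac{1}{n}\sum\frac{y_i^2}{p_i} - \frac{1}{n'n(n - 1)}\sum{}'\frac{y_iy_j}{p_ip_j}\right] + \frac{N^2}{n'^2}\,\frac{1}{n(n - 1)}\sum\left(\frac{y_i}{p_i} - \frac{1}{n}\sum\frac{y_j}{p_j}\right)^2 \tag{7.11}$$

where $p_i = x_i/x'$ and $\sum'$ denotes summation over all pairs $i \ne j$ in the subsample.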




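PROOF Given the initial sample, $E_2[(x'/n)\sum(y_i/x_i)] = \sum^{n'}y_i = n'\bar y'$, so that
$E(\hat Y) = E(N\bar y') = Y$. Regarding the variance of $\hat Y$,

$$E_2(\hat Y) = N\bar y' \qquad V_1E_2(\hat Y) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2$$

$$V_2(\hat Y) = \frac{N^2}{n'^2}\,\frac{1}{n}\sum_{i<j}^{n'}x_ix_j\left(\frac{y_i}{x_i} - \frac{y_j}{x_j}\right)^2$$

$$E_1V_2(\hat Y) = \frac{N^2}{n'^2}\,\frac{1}{n}\,\frac{n'(n' - 1)}{N(N - 1)}\sum_{i<j}^{N}x_ix_j\left(\frac{y_i}{x_i} - \frac{y_j}{x_j}\right)^2$$

since the probability of a specified pair of units being selected in the
sample is $n'(n' - 1)/N(N - 1)$. Hence

$$V(\hat Y) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2 + \frac{N(n' - 1)}{nn'(N - 1)}\,V_p(y)$$

This proves (7.10). In order to get an unbiased estimator of the variance,
we note that given the first sample, $(1/n)\sum^{n}(y_i^2/p_i)$ estimates $\sum^{n'}y_i^2$,
and $\sum'(y_i/p_i)(y_j/p_j)/n(n - 1)$ estimates $(\sum^{n'}y_i)^2$, from which we get that
$\sum^{n'}(y_i - \bar y')^2/(n' - 1)$ is estimated by

$$\frac{\dfrac{1}{n}\sum\dfrac{y_i^2}{p_i} - \dfrac{1}{n'n(n - 1)}\sum{}'\dfrac{y_iy_j}{p_ip_j}}{n' - 1}$$

But $E_1\sum^{n'}(y_i - \bar y')^2/(n' - 1) = S_y^2$. Hence $V_1E_2(\hat Y)$ can be estimated
from the sample. Noting that $E_1V_2(\hat Y)$ is estimated by

$$\frac{N^2}{n'^2}\,\frac{1}{n(n - 1)}\sum\left(x'\frac{y_i}{x_i} - \frac{1}{n}\sum x'\frac{y_j}{x_j}\right)^2$$

the proof of (7.11) is complete.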

Corollary

For the cost function considered in Sec. 7.2, double sampling for pps
estimation will be superior to one-sample simple random sampling if

$$V_p(y) < \frac{n' - n_0}{n' - 1}\,\frac{n}{n_0}\,N(N - 1)S_y^2 \tag{7.12}$$

It may be noted that $V_p(y)$ is the variance of $\hat Y$ based on a pps
sample of size unity, while $N(N - 1)S_y^2$ is a similar quantity based on
a simple random sample. And $(n' - n_0)/(n' - 1) < 1$, $n/n_0 < 1$. Thus
inequality (7.12) will be satisfied when a pps sample is far better than
a simple random sample.

Further reading See Exercise 59 for a generalization to multistage
sampling.


7.3.1 THE CASE OF INDEPENDENT SAMPLES 

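In this case the first sample is used solely for estimating X. An independent
sample of size n is selected with pps using the procedure described
in Sec. 3.8 [due to Lahiri (1951)], in which it is not necessary to know X.

Theorem 7.2

If the first sample is simple random and the second sample is independently
selected with pps, an unbiased estimator of Y is

$$\hat Y = \hat R\hat X \qquad \hat R = \frac{1}{n}\sum\frac{y_i}{x_i} \qquad \hat X = N\bar x' \tag{7.13}$$

with

$$V(\hat Y) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)R^2S_x^2 + \frac{1}{n}V_p(y)\left[1 + N^2\left(\frac{1}{n'} - \frac{1}{N}\right)\frac{S_x^2}{X^2}\right] \tag{7.14}$$

and an unbiased estimator of $V(\hat Y)$ is

$$v(\hat Y) = \hat R^2v(\hat X) + \hat X^2v(\hat R) - v(\hat R)v(\hat X) \tag{7.15}$$

PROOF Since $E(\hat R) = R$ and $E(\hat X) = X$, we have $E(\hat Y) = Y$.
For the variance we use the result (Sec. 1.5) relating to the variance of the
product of two independent random variables. We have

$$V(\hat X) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_x^2 \qquad V(\hat R) = \frac{1}{X^2}\,\frac{1}{n}V_p(y)$$

so that $V(\hat Y) = R^2V(\hat X) + X^2V(\hat R) + V(\hat R)V(\hat X)$, which is (7.14).
Again, an unbiased estimator of $V(\hat X)$ is

$$v(\hat X) = N^2\left(\frac{1}{n'} - \frac{1}{N}\right)s_x'^2 \qquad s_x'^2 = \frac{1}{n' - 1}\sum^{n'}(x_i - \bar x')^2$$

and an unbiased estimator of $V(\hat R)$ is

$$v(\hat R) = \frac{1}{n(n - 1)}\sum^{n}\left(\frac{y_i}{x_i} - \hat R\right)^2$$

Hence, by the same result (Sec. 1.5), we have

$$v(\hat Y) = \hat R^2v(\hat X) + \hat X^2v(\hat R) - v(\hat R)v(\hat X)$$

which is the same as the expression given in (7.15).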


7.4 DOUBLE SAMPLING WITH PPS SELECTION 

In the situations considered so far the initial sample was always taken
with equal probabilities in the absence of any auxiliary information. In
case information on a character z is available, we may select the initial
sample $A_1$ with probabilities $p_i$ ($i = 1, \ldots, N$) proportionate to z and
collect information on x. The second sample is a subsample of $A_1$,
selected with equal probabilities without replacement, in which information
on y is collected. We then prove (Raj, 1965b) the following theorem.


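Theorem 7.3

When the initial sample is selected with pp to z and the second sample is a
simple random subsample, an unbiased estimator of Y is

$$\hat Y = \frac{1}{n}\sum^{n}\frac{y_i}{p_i} - \frac{k}{n}\sum^{n}\frac{x_i}{p_i} + \frac{k}{n'}\sum^{n'}\frac{x_i}{p_i} \tag{7.16}$$

with

$$V(\hat Y) = \frac{1}{n}V_p(y) + \left(\frac{1}{n} - \frac{1}{n'}\right)[k^2V_p(x) - 2k\delta\sigma_p(y)\sigma_p(x)] \tag{7.17}$$

and an unbiased estimator of $V(\hat Y)$ is

$$v(\hat Y) = \frac{1}{n'(n - 1)}\sum^{n}\left(\frac{y_i}{p_i} - \frac{1}{n}\sum^{n}\frac{y_j}{p_j}\right)^2 + \left(\frac{1}{n} - \frac{1}{n'}\right)\frac{1}{n - 1}\sum^{n}\left(\frac{d_i}{p_i} - \frac{1}{n}\sum^{n}\frac{d_j}{p_j}\right)^2 \tag{7.18}$$

where $d_i = y_i - kx_i$ and

$$V_p(y) = \sigma_p^2(y) = \sum p_i\left(\frac{y_i}{p_i} - Y\right)^2 \qquad \delta = \frac{\sum p_i\left(\dfrac{y_i}{p_i} - Y\right)\left(\dfrac{x_i}{p_i} - X\right)}{\sigma_p(y)\,\sigma_p(x)} \tag{7.19}$$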


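PROOF Given $A_1$, $E_2(\hat Y) = (1/n')\sum^{n'}(y_i/p_i)$, so that $E(\hat Y) = Y$, and

$$V_1E_2(\hat Y) = \frac{1}{n'}V_p(y)$$

where $V_p(y) = \sum p_i(y_i/p_i - Y)^2$. Now

$$V_2(\hat Y) = \left(\frac{1}{n} - \frac{1}{n'}\right)\frac{1}{n' - 1}\sum^{n'}\left[\frac{d_i}{p_i} - \frac{1}{n'}\sum^{n'}\frac{d_j}{p_j}\right]^2$$

since we take a random sample of n from $n'$. And

$$E_1V_2(\hat Y) = \left(\frac{1}{n} - \frac{1}{n'}\right)\sum p_i\left[\frac{y_i - kx_i}{p_i} - (Y - kX)\right]^2 = \left(\frac{1}{n} - \frac{1}{n'}\right)[V_p(y) + k^2V_p(x) - 2k\delta\sigma_p(y)\sigma_p(x)]$$

This proves the expression for the variance of $\hat Y$. Regarding the question
of making an unbiased estimate of variance, we note that

$$E\left[\frac{1}{n - 1}\sum^{n}\left(\frac{y_i}{p_i} - \frac{1}{n}\sum^{n}\frac{y_j}{p_j}\right)^2\right] = V_p(y)$$

and

$$E\left[\frac{1}{n - 1}\sum^{n}\left(\frac{d_i}{p_i} - \frac{1}{n}\sum^{n}\frac{d_j}{p_j}\right)^2\right] = \sum p_i\left[\frac{d_i}{p_i} - (Y - kX)\right]^2$$

Hence the estimator given in (7.18) is unbiased for $V(\hat Y)$.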

Corollary

It is possible to find the condition in which
the double-sampling procedure would be preferable, for the same cost, to taking a single
sample of size $n_0$ with pp to z. Denoting $k\sigma_p(x)/\sigma_p(y)$ by h, the condition
is

$$2\delta - h > \frac{1}{h\left(1 - \dfrac{n}{n'}\right)\left(1 + \dfrac{cn}{c'n'}\right)} \tag{7.20}$$


7.4.1 INDEPENDENT SAMPLES 


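If the two samples are selected independently, both with pp to z, it is
fairly simple to see that an unbiased estimator of the population total Y
is given by

$$\hat Y = \left[\frac{1}{n}\sum^{n}\frac{y_i}{p_i} - \frac{k}{n}\sum^{n}\frac{x_i}{p_i}\right] + \frac{k}{n'}\sum^{n'}\frac{x_i}{p_i} \tag{7.21}$$

where the terms in the bracket come from the second sample. The
variance of $\hat Y$ now is

$$V(\hat Y) = \frac{1}{n}[V_p(y) + k^2V_p(x) - 2k\delta\sigma_p(y)\sigma_p(x)] + \frac{k^2}{n'}V_p(x) \tag{7.22}$$

and an unbiased estimator will be found to be

$$\frac{k^2}{n'(n' - 1)}\sum^{n'}\left(\frac{x_i}{p_i} - \frac{1}{n'}\sum^{n'}\frac{x_j}{p_j}\right)^2 + \frac{1}{n(n - 1)}\sum^{n}\left(\frac{d_i}{p_i} - \frac{1}{n}\sum^{n}\frac{d_j}{p_j}\right)^2 \tag{7.23}$$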


7.5 DOUBLE SAMPLING FOR UNBIASED RATIO ESTIMATION 

In most of the applications of the double-sampling technique presented
so far, the method of difference estimation has been used and pps sampling
carried out with replacement. We will now illustrate the use of the
ratio method. In order to get unbiased estimates the technique of
sampling with pp to aggregate size will be used. The initial sample is
simple random and the second sample is taken from it with pp to aggregate
x, a variate measured in the first sample. The proposed estimator for
Y is

$$\hat Y = N\,\frac{\bar y}{\bar x}\,\bar x' \tag{7.24}$$

where $\bar y$, $\bar x$, and $\bar x'$ are defined in Sec. 7.2. Given the initial sample,

$$E_2\left(\frac{\bar y}{\bar x}\right) = \frac{\bar y'}{\bar x'}$$

so that $E(\hat Y) = Y$.
The expectation of $(\bar y\bar x'/\bar x)^2$ can be calculated from first principles as
(Raj, 1954b)

n 

where Y denotes summation over all possible samples of size n from an 

initial sample of size n' and £ denotes summation over all possible 
samples of size n' from the population of size N. Hence 


(«> 

To get an unbiased estimator of the variance we note that 

from which we estimate Y* = £ Y£ 2 £ y\y\ as 

, G — AW + 2 (7.26) 

V&c. n - 1 Sxi ) 

Hence an unbiased estimator of $V(\hat Y)$ is given by

$$v(\hat Y) = (\hat Y)^2 - G \tag{7.27}$$

See Exercise 60 for an unbiased ratio-type estimator.


7.6 DOUBLE SAMPLING FOR BIASED RATIO ESTIMATION

Instead of sampling with pp to aggregate x, one could take a
simple random subsample (as in Sec. 7.2) and use the familiar biased ratio
estimator

$$\hat\mu = \frac{\bar y}{\bar x}\,\bar x' \tag{7.28}$$


‘ppropriate to this scheme. By app lying Xheorem , ^ we have 

® (f *') - ? - * , coy (t, a,*-) 




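which gives an expression for the bias of $\hat\mu$. In case the two samples
are independent, the bias in $\hat\mu$ is the same as that in the usual ratio estimator
applicable to single-phase sampling. With the help of Theorem 5.3,
a first approximation to the mean square error of $\hat\mu$ will be obtained as

$$\mathrm{MSE}(\hat\mu) = \frac{1}{\bar X^2}E(\bar y\bar x' - \bar Y\bar x)^2 = \frac{1}{\bar X^2}V(\bar y\bar x' - \bar Y\bar x)$$

Now, applying the result (Sec. 1.5), which gives the variance of the product
of two independent random variables, we have

$$V(\bar y\bar x') = \bar X^2\left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 + \bar Y^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_x^2 + \left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2S_x^2$$

And

$$V(\bar Y\bar x) = \bar Y^2\left(\frac{1}{n} - \frac{1}{N}\right)S_x^2 \qquad \mathrm{Cov}(\bar y\bar x', \bar x) = \bar X\left(\frac{1}{n} - \frac{1}{N}\right)\rho S_yS_x$$

Thus

$$\mathrm{MSE}(\hat\mu) \approx \left(\frac{1}{n} - \frac{1}{N}\right)(S_y^2 - 2R\rho S_yS_x + R^2S_x^2) + \left(\frac{1}{n'} - \frac{1}{N}\right)R^2S_x^2 + \left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{1}{n'} - \frac{1}{N}\right)\frac{S_y^2S_x^2}{\bar X^2} \tag{7.30}$$

Neglecting the last term on the right-hand side of (7.30), which would be
quite small, we obtain an approximate expression for MSE($\hat\mu$) which
agrees with Cochran (1963). An exact expression for the variance of the
estimator can be obtained by treating $\bar y/\bar x$ and $\bar x'$ as two independent
random variables and using Eq. (1.18). When the second sample is a
subsample of the first, the random variables $\bar y/\bar x$ and $\bar x'$ are dependent;
in this case the result given in Exercise 84 applies. It will be found that
an approximate expression for the mean square error is

$$\mathrm{MSE}(\hat\mu) = \left(\frac{1}{n} - \frac{1}{N}\right)(S_y^2 - 2R\rho S_yS_x + R^2S_x^2) + \left(\frac{1}{n'} - \frac{1}{N}\right)(2R\rho S_yS_x - R^2S_x^2) \tag{7.31}$$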


7 -6.1 COMPARISON WITH THE DIFFERENCE ESTIMATOR 

A comparison can now be made between double sampling for difference
estimation and double sampling for ratio estimation when both samples
are simple random. It will be found from (7.4), (7.8), (7.30), and (7.31)
that the exact variance of the difference estimator is the same as the first
(large-sample) approximation to the variance of the ratio estimator
provided that k of the difference estimator equals $R = \bar Y/\bar X$.
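The agreement between (7.4) with k = R and (7.31) for the subsample case is a matter of algebra, and it can also be confirmed numerically. The Python sketch below does so; the function names and parameter values are hypothetical.

```python
def var_difference(n, n_first, N, S_y, S_x, rho, k):
    """Exact variance of the difference estimator, Eq. (7.4)."""
    return ((1 / n - 1 / N) * S_y ** 2
            - (1 / n - 1 / n_first) * k * S_x * (2 * rho * S_y - k * S_x))

def mse_ratio(n, n_first, N, S_y, S_x, rho, R):
    """Large-sample mean square error of the ratio estimator, Eq. (7.31)."""
    cross = 2 * R * rho * S_y * S_x - R ** 2 * S_x ** 2
    return ((1 / n - 1 / N) * (S_y ** 2 - cross)
            + (1 / n_first - 1 / N) * cross)

args = dict(n=50, n_first=200, N=1000, S_y=10.0, S_x=4.0, rho=0.7)
print(var_difference(k=2.0, **args))
print(mse_ratio(R=2.0, **args))
```

Both calls give the same value, since $(1/n - 1/N) - (1/n' - 1/N) = 1/n - 1/n'$.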

In case the initial sample is selected with replacement with pps and
the subsample is selected with equal probabilities without replacement,
the variance of the difference estimator is given by (7.17). On the other
hand, the variance of the biased ratio estimator when both samples are
simple random is given approximately by $N^2$ times the expression in
(7.31). The two variances can be compared on the assumption that
k = R and that the sampling fractions are small. It will be found that
the superiority of double sampling with pps selection to double sampling
for biased ratio estimation lies in the selection of the first sample with
unequal probabilities. The same remarks apply whether or not the two
samples are independent.


7.7 DOUBLE SAMPLING FOR REGRESSION ESTIMATION 

With the selection procedure of Sec. 7.6 we may use the regression estimator
in place of the ratio estimator. The first sample is a random sample
of size $n'$ from which a subsample of size $n = n'\lambda$ is taken for measuring y.
Based on this subsample the regression coefficient b is calculated, as well
as $\bar y$ and $\bar x$. The double-sampling regression estimator of $\bar Y$ is then

$$\hat\mu = \bar y - b(\bar x - \bar x') \tag{7.32}$$

where $\bar x'$ is the mean of the initial sample. We shall obtain a large-sample
approximation for the variance of this estimator. It would be
convenient to write $\bar x'$ as $\lambda\bar x + \mu\bar x''$, where $\mu = 1 - \lambda$ and $\bar x''$ is the mean
of the $n'\mu$ units in the first sample not common with the second sample.
Then $\hat\mu = \bar y - \mu b(\bar x - \bar x'')$. If n is large, the distribution of $\hat\mu$ will be
the same as that of (Sec. 5.14)

$$\bar y - \mu B(\bar x - \bar x'') \qquad B = \rho S_y/S_x$$

Hence the large-sample variance of $\hat\mu$ is given by

$$V(\hat\mu) = \frac{S_y^2}{n'\lambda} + \mu^2B^2\left(\frac{1}{n'\lambda} + \frac{1}{n'\mu}\right)S_x^2 - 2\mu B\,\frac{\rho S_yS_x}{n'\lambda}$$

$$= \frac{S_y^2}{n'\lambda}\,[(1 - \rho^2) + \lambda\rho^2] = \frac{S_y^2(1 - \rho^2)}{n} + \frac{\rho^2S_y^2}{n'} \tag{7.33}$$






This expression for the variance shows that the double-sampling method
would be better than taking a direct sample of n for y. Using the simple
cost function given by (7.5), it is possible to find the best value of $n/n'$ by
calculus methods. It will be found that $n/n' = [(1 - \rho^2)c'/(\rho^2c)]^{1/2}$
and for this subsampling rate the minimum variance is

$$V_{\min} = \frac{S_y^2\left[\rho\sqrt{c'} + \sqrt{(1 - \rho^2)c}\right]^2}{C}$$

Thus the condition under which the double-sampling method is better
than taking a direct sample for y, for the same cost, is

$$\rho^2 > \frac{4cc'}{(c + c')^2} \tag{7.34}$$
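The optimum subsampling rate, the minimum variance, and condition (7.34) can be evaluated together. The following Python sketch does this for hypothetical values of $\rho$, the unit costs, and the budget; the function name `regression_double_sampling_design` is an assumption made for illustration, and the finite-population correction is ignored, as in (7.33).

```python
import math

def regression_double_sampling_design(rho, c, c_first, budget, S_y):
    """Sketch of the optimum design for (7.33) under the cost function (7.5)."""
    ratio = math.sqrt((1 - rho ** 2) * c_first / (rho ** 2 * c))   # n / n'
    n_first = budget / (c_first + c * ratio)                       # from C = c'n' + cn
    n = ratio * n_first
    v_min = S_y ** 2 * (rho * math.sqrt(c_first)
                        + math.sqrt((1 - rho ** 2) * c)) ** 2 / budget
    v_direct = S_y ** 2 * c / budget        # direct sample of size C/c for y alone
    worthwhile = rho ** 2 > 4 * c * c_first / (c + c_first) ** 2   # condition (7.34)
    return n_first, n, v_min, v_direct, worthwhile

n_first, n, v_min, v_direct, worthwhile = regression_double_sampling_design(
    rho=0.8, c=16.0, c_first=1.0, budget=1000.0, S_y=10.0)
print(n_first, n, v_min, v_direct, worthwhile)
```

For these inputs the minimum variance is below that of the direct sample and (7.34) holds; with a weaker correlation or a smaller cost ratio the comparison can go the other way.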


7.8 DOUBLE SAMPLING FOR STRATIFICATION 

If it is considered desirable to introduce a stratification with respect to
x, an initial sample is taken to collect information on x. On the basis
of this the sample units are allocated to the L strata desired to be made.
If the initial sample is simple random, the number of units $n_h'$ falling in
the hth stratum would be a random variable. From $n_h'$ a simple random
subsample of size $n_h$ is taken to collect information on y. It is obvious
that the $n_h$ units in the hth stratum form a simple random sample of the
$N_h$ (unknown) in the stratum. Denoting by $\bar y_h$ the sample mean for y,
we have

$$E(\bar y_h) = \bar Y_h \qquad V(\bar y_h) = \left(\frac{1}{n_h} - \frac{1}{N_h}\right)S_h^2$$

Denoting $n_h'/n'$ by $a_h$ and using the method developed in Sec. 1.8,
it is easy to establish that

$$E(a_h) = \frac{N_h}{N} = W_h \qquad V(a_h) = bW_h(1 - W_h)$$

$$b = \frac{N - n'}{n'(N - 1)} \qquad \mathrm{Cov}(a_h, a_k) = -bW_hW_k$$

As an estimator of the population mean we take

$$\hat\mu = \sum a_h\bar y_h \tag{7.35}$$

Since $E\sum a_h\bar y_h = E(\sum a_h\bar Y_h) = \sum W_h\bar Y_h$, the estimator is unbiased. To find
its variance, we have

$$V_2(\hat\mu) = \sum a_h^2V(\bar y_h) \qquad E_1V_2(\hat\mu) = \sum\,[bW_h(1 - W_h) + W_h^2]\,V(\bar y_h)$$

$$E_2(\hat\mu) = \sum a_h\bar Y_h$$

$$V_1E_2(\hat\mu) = b\sum W_h(1 - W_h)\bar Y_h^2 - 2b\sum_h\sum_{k>h}W_hW_k\bar Y_h\bar Y_k = b\sum W_h\bar Y_h^2 - b\left(\sum W_h\bar Y_h\right)^2 = b\sum W_h(\bar Y_h - \bar Y)^2$$

Hence

$$V(\hat\mu) = \sum W_h^2\left(\frac{1}{n_h} - \frac{1}{N_h}\right)S_h^2 + b\sum W_h(1 - W_h)\left(\frac{1}{n_h} - \frac{1}{N_h}\right)S_h^2 + b\sum W_h(\bar Y_h - \bar Y)^2 \tag{7.36}$$

Remark If information on x is already available for the entire popula-
tion, the variance obtained from a stratified random sample would be the
first term in (7.36). The other terms in $V(\hat M_d)$ represent the price to be
paid when stratification has to be introduced on the basis of a preliminary
sample. In the latter case the strata weights will have to be estimated
from the sample.

Remark For large N, b is approximately 1/n'. For $n_h = nW_h$, the vari-
ance of the double-sampling procedure is approximately given by

$$\frac{1}{n}\sum W_hS_h^2 + \frac{1}{n'}\sum W_h(\bar Y_h - \bar Y)^2$$

For the cost function of Sec. 7.2, the variance from the single-sampling
procedure would be approximately given by

$$\frac{1}{n_0}\sum W_hS_h^2 + \frac{1}{n_0}\sum W_h(\bar Y_h - \bar Y)^2$$

Thus the between-strata contribution to the variance would be con¬ 
siderably smaller with the double-sampling procedure. 
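
The variance (7.36) can be evaluated directly once the stratum quantities are specified. The sketch below is illustrative only: the function and argument names are hypothetical, the subsample sizes $n_h$ are treated as fixed, and $N_h$ is taken as $NW_h$.

```python
def variance_double_sampling_stratified(W, S2, Ybar, n_h, N, n_prime):
    """Evaluate (7.36) for double sampling for stratification.

    W       : stratum weights W_h (summing to 1)
    S2      : within-stratum variances S_h^2
    Ybar    : stratum means
    n_h     : subsample sizes per stratum (treated as fixed here)
    N       : population size
    n_prime : size of the initial sample
    """
    b = (N - n_prime) / (n_prime * (N - 1))
    overall_mean = sum(w * y for w, y in zip(W, Ybar))
    within = 0.0
    for w, s2, nh in zip(W, S2, n_h):
        v_h = (1.0 / nh - 1.0 / (N * w)) * s2        # V(ybar_h), with N_h = N * W_h
        within += (w * w + b * w * (1.0 - w)) * v_h  # first two terms of (7.36)
    between = b * sum(w * (y - overall_mean) ** 2 for w, y in zip(W, Ybar))
    return within + between
```

With n' = N the factor b vanishes and only the ordinary stratified-sampling term remains, which is the point of the first Remark.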

Remark The estimator of the variance is obtained in Exercise 57. 


7.9 REPETITIVE SURVEYS 

The discussion presented so far in this book relates to what may be called
one-time surveys. When data on some items have been collected on a
population of N units, the matter ends there. But many surveys these
days are repetitive in character. Most governments collect information
regularly on the same population to find out, say, the number unemployed,
their characteristics, and so on. Such surveys present certain novel
features; these features will form the subject matter of the remainder of
this chapter. The reason for discussing repetitive surveys is that they
have certain similarities to double-sampling procedures. Say that a first
sample has been taken (on one occasion) and a second sample is to be
taken (on another occasion). There is thus an opportunity of making
use of the information contained in the first sample. The problem is how
best to learn from past experience and use it for improving the precision
of future estimates. Actually earlier estimates too can be revised in
the light of new experience. Estimates can be made not only for the
existing time period (current estimates) but also of the change that has
taken place since the previous occasion (estimates of change) and of the
average over a given period (estimates of sum). An interesting question
to consider is: Should the same sample be used every time, or a completely
new sample, or a mixture of the old and the new? Although the answer
does not depend solely on the variances involved, we shall make a begin-
ning by illustrating how the sample on the first occasion could possibly be
used to form estimates on the second occasion.


7.9.1 SAMPLING OVER TWO OCCASIONS 

A population is sampled over two occasions for making current estimates
of a character, say, unemployment. On the first occasion a simple
random sample of n units is taken. A random subsample of m = nλ
units is retained (matched) for use on a second occasion, on which another
independent random sample of u = n - m = nμ units is selected
(unmatched with the first occasion). For simplicity we shall denote by
y and x the measurements on the second and first occasions, respectively.
The finite population corrections will be neglected, and the variate will
be assumed to have the same variance $S^2$ on each occasion. The mean
on occasion h will be denoted by $M_h$. The mean on the first occasion
will be estimated by $\hat M_1 = (1/n)\sum^n x_i$. For estimating $M_2$, two independent
estimates can be made. One is $\hat M_{2u} = (1/u)\sum^u y_i$, which is based on the
unmatched part, and the other is the difference estimator

$$\hat M_{2m} = \left(\frac{1}{m}\sum^m y_i - \frac{1}{m}\sum^m x_i\right) + \frac{1}{n}\sum^n x_i \tag{7.37}$$



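which is based on the matched part. Given the sample selected on the
first occasion,

$$E_2(\hat M_{2m}) = \frac{1}{n}\sum^n y_i - \frac{1}{n}\sum^n x_i + \frac{1}{n}\sum^n x_i = \frac{1}{n}\sum^n y_i$$

Hence

$$E(\hat M_{2m}) = E\left(\frac{1}{n}\sum^n y_i\right) = M_2$$

which shows that $\hat M_{2m}$ is an unbiased estimator of $M_2$. The two estima-
tors $\hat M_{2u}$ and $\hat M_{2m}$ could be weighted inversely to their variances if we
wished to find an improved estimator of $M_2$. We have

$$V(\hat M_1) = \frac{S^2}{n} \qquad V(\hat M_{2u}) = \frac{S^2}{u} = \frac{S^2}{n\mu}$$

$$V_1E_2(\hat M_{2m}) = V\left(\frac{1}{n}\sum^n y_i\right) = \frac{S^2}{n}$$

$$V_2(\hat M_{2m}) = \left(\frac{1}{m} - \frac{1}{n}\right)\frac{1}{n - 1}\sum^n\Big[y_i - x_i - \frac{1}{n}\sum^n (y_i - x_i)\Big]^2$$

$$E_1V_2(\hat M_{2m}) = \left(\frac{1}{m} - \frac{1}{n}\right)\frac{1}{N - 1}\sum^N [(y_i - M_2) - (x_i - M_1)]^2$$

$$= \left(\frac{1}{m} - \frac{1}{n}\right)(S^2 + S^2 - 2\rho S^2) = \frac{2(1 - \lambda)S^2(1 - \rho)}{n\lambda}$$

Hence

$$V(\hat M_{2m}) = \frac{S^2}{n\lambda}[1 + (1 - \lambda)(1 - 2\rho)] \tag{7.38}$$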


The estimates and their variances on the second occasion may thus be
exhibited as below:

                   Estimate          Variance

Unmatched part     $\hat M_{2u}$     $\dfrac{S^2}{n\mu} = \dfrac{1}{W_{2u}}$

Matched part       $\hat M_{2m}$     $\dfrac{S^2}{n\lambda}[1 + (1 - \lambda)(1 - 2\rho)] = \dfrac{1}{W_{2m}}$

By weighting the two estimates inversely to their variances, we have

$$\hat M_2 = \frac{W_{2u}\hat M_{2u} + W_{2m}\hat M_{2m}}{W_{2u} + W_{2m}} \tag{7.39}$$

$$V(\hat M_2) = (W_{2u} + W_{2m})^{-1} = \frac{S^2}{n}[1 + (1 - 2\rho)\mu][1 + (1 - 2\rho)\mu^2]^{-1} \tag{7.40}$$






In order to determine the best value of μ [which minimizes $V(\hat M_2)$], we
differentiate $V(\hat M_2)$ with respect to μ and equate it to zero. This gives

$$\mu = \frac{1}{1 + \sqrt{2(1 - \rho)}} \qquad \lambda = \frac{\sqrt{2(1 - \rho)}}{1 + \sqrt{2(1 - \rho)}} \tag{7.41}$$

The minimum variance is found to be

$$V_{\min} = \frac{S^2}{2n}\left(1 + \sqrt{2}\sqrt{1 - \rho}\right) = \frac{S^2}{n}\left(\frac{1}{2} + \sqrt{\frac{1 - \rho}{2}}\right) \tag{7.42}$$

If a completely independent sample is taken on the second occasion, the
estimate will be $\hat M_2 = (1/n)\sum y_i$, with a variance of $S^2/n$, which is greater
than $V_{\min}$ for ρ > 1/2. If the same sample is taken on the second occasion,
the variance will again be $S^2/n$. Thus, for making current estimates
(using the difference estimator) the best policy is to replace the sample
partially.

The optimum percent to match and the relative gain in precision
compared with no matching are given in Table 7.1 for different values of ρ.
It is found that with this estimator no more than 50 percent should be
matched and that this percentage decreases as ρ increases. For low
values of ρ the gain in precision is quite small.


Table 7.1 Comparison of matched and unmatched samples

  ρ       Optimum %     % gain in precision
          to match      relative to no matching

  0.5        50                  0
  0.6        47                  6
  0.7        44                 13
  0.8        39                 23
  0.9        31                 38
  0.95       24                 52
  1.00        0                100
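
The entries of Table 7.1 follow from (7.41) and (7.42). As a check, a short sketch (the function name is hypothetical) computes the optimum percentage to match and the percentage gain over no matching for a given ρ.

```python
import math

def optimum_match_two_occasions(rho):
    """Optimum % matched and % gain in precision for the difference estimator.

    Uses lambda = sqrt(2(1 - rho)) / (1 + sqrt(2(1 - rho)))   from (7.41)
    and  V_min / (S^2 / n) = (1 + sqrt(2(1 - rho))) / 2        from (7.42).
    """
    root = math.sqrt(2.0 * (1.0 - rho))
    lam = root / (1.0 + root)
    vmin_factor = 0.5 * (1.0 + root)            # V_min in units of S^2/n
    gain = 100.0 * (1.0 / vmin_factor - 1.0)    # relative to variance S^2/n
    return 100.0 * lam, gain

if __name__ == "__main__":
    for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0):
        pct, gain = optimum_match_two_occasions(rho)
        print(f"{rho:5.2f}  {pct:5.0f}  {gain:5.0f}")
```

Rounded to whole numbers, the output agrees with the table row by row.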


Further reading Suppose a sample $A_1$ of n clusters is selected on the first
occasion with probabilities proportionate to size. On the second occasion
a simple random sample of m clusters is selected without replacement from
$A_1$ and an independent sample of n - m clusters is selected in the same
manner as $A_1$. Following the procedure of Sec. 7.9.1 it is possible to
make use of both samples (Raj, 1965b) for obtaining an estimate of Y on
the second occasion.



7.9.2 MINIMUM-VARIANCE CURRENT ESTIMATES

Suppose it is desired to know what is the minimum-variance unbiased
linear estimator of $M_2$ and the associated optimum fraction to be replaced
on the second occasion. This question can be answered if the population
is assumed infinite, and if minimum-variance estimators are understood
in the sense of general estimation theory. The following notation is due
to Yates (1949) and Patterson (1950).

  2d occasion                |  y'  |  y'' |
  1st occasion        |  x'' |  x'  |

A single prime indicates the units common to the two occasions and a
double prime indicates the units selected independently. The best
linear estimator sought will be of the form

$$\hat M_2 = a(\bar x'' - \bar x') + c\bar y' + (1 - c)\bar y'' \tag{7.43}$$


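in order that this be unbiased. If it is a minimum-variance (MV)
unbiased estimator, it must be uncorrelated with every zero function
[Theorem 1.10 due to Rao (1952)]. Hence it is uncorrelated with
$\bar y' - \bar y''$ as well as with $\bar x' - \bar x''$. Thus

(1) $\mathrm{Cov}(\bar y', \hat M_2) = \mathrm{Cov}(\bar y'', \hat M_2)$

(2) $\mathrm{Cov}(\bar x', \hat M_2) = \mathrm{Cov}(\bar x'', \hat M_2)$

Noting that

$$\mathrm{Cov}(\bar y', \bar x'') = \mathrm{Cov}(\bar y', \bar y'') = 0 \qquad \mathrm{Cov}(\bar y', \bar x') = \rho\frac{\sigma^2}{n\lambda} \qquad \mathrm{Cov}(\bar y', \bar y') = \frac{\sigma^2}{n\lambda}$$

$$\mathrm{Cov}(\bar y'', \bar x'') = \mathrm{Cov}(\bar y'', \bar x') = 0 \qquad \mathrm{Cov}(\bar y'', \bar y'') = \frac{\sigma^2}{n\mu}$$

we have from (1)

$$-a\rho\frac{\sigma^2}{n\lambda} + c\frac{\sigma^2}{n\lambda} = (1 - c)\frac{\sigma^2}{n\mu}$$

Similarly (2) gives

$$-a\frac{\sigma^2}{n\lambda} + c\rho\frac{\sigma^2}{n\lambda} = a\frac{\sigma^2}{n\mu}$$

Solving the two equations for a and c we get

$$a = \frac{\lambda\mu\rho}{1 - \rho^2\mu^2} \qquad c = \frac{\lambda}{1 - \rho^2\mu^2} \tag{7.44}$$

Hence the best linear estimator of $M_2$ is given by

$$\hat M_2 = \frac{1}{1 - \rho^2\mu^2}\left[\lambda\mu\rho(\bar x'' - \bar x') + \lambda\bar y' + \mu(1 - \rho^2\mu)\bar y''\right] \tag{7.45}$$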


In order to find the variance of $\hat M_2$ we use the Corollary to Theorem
1.10, by which the variance of $\hat M_2$ equals the covariance between $\hat M_2$ and
any unbiased estimator of $M_2$. Thus

$$V(\hat M_2) = \mathrm{Cov}(\bar y'', \hat M_2) = \frac{1 - \rho^2\mu}{1 - \rho^2\mu^2}\,\frac{\sigma^2}{n} \tag{7.46}$$

By differentiating $V(\hat M_2)$ with respect to μ and equating to zero, the best
values of μ and λ are found to be

$$\mu = \frac{1}{1 + (1 - \rho^2)^{1/2}} \qquad \lambda = \frac{(1 - \rho^2)^{1/2}}{1 + (1 - \rho^2)^{1/2}} \tag{7.47}$$

$$V_{\min}(\hat M_2) = \frac{1 + (1 - \rho^2)^{1/2}}{2}\,\frac{\sigma^2}{n} \tag{7.48}$$

Remark It can be verified that the use of the difference estimator

$$\hat M_{2m} = (\bar y' - \rho\bar x') + \rho(\lambda\bar x' + \mu\bar x'') \tag{7.49}$$

in place of (7.37) in Sec. 7.9.1 will lead to the minimum-variance estimator
$\hat M_2$ given by (7.45). A value for ρ will have to be substituted on the
basis of past experience. This will retain the unbiased character of the
estimator although the variance will increase.
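
The relation between the variance (7.46) and the optimum of (7.47) and (7.48) can be checked numerically. The following is a minimal sketch with hypothetical function names; variances are expressed in units of $\sigma^2/n$.

```python
import math

def mv_variance_factor(rho, mu):
    """V(M2_hat) / (sigma^2 / n) from (7.46), for unmatched fraction mu."""
    return (1.0 - rho ** 2 * mu) / (1.0 - rho ** 2 * mu ** 2)

def mv_optimum(rho):
    """Optimum (lambda, mu) from (7.47) and the minimum factor from (7.48)."""
    s = math.sqrt(1.0 - rho ** 2)
    mu = 1.0 / (1.0 + s)
    lam = s / (1.0 + s)
    return lam, mu, 0.5 * (1.0 + s)
```

For example, with ρ = 0.8 the optimum is to match 37.5 percent of the sample, and the variance falls to 0.8 of that of an independent sample; evaluating `mv_variance_factor` at neighbouring values of μ gives larger values.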


7.9.3 ESTIMATION OF CHANGE 

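As in the case of $M_2$, the best linear unbiased estimate of $M_1$ will be of the
form

$$\hat M_1 = b(\bar y'' - \bar y') + d\bar x' + (1 - d)\bar x'' \tag{7.50}$$

In order to be a MV estimate, it should have zero correlation with
$\bar x' - \bar x''$ as well as with $\bar y' - \bar y''$. By following the steps indicated in
Sec. 7.9.2, $\hat M_1$ will be found as

$$\hat M_1 = \frac{1}{1 - \rho^2\mu^2}\left[\lambda\mu\rho(\bar y'' - \bar y') + \lambda\bar x' + \mu(1 - \rho^2\mu)\bar x''\right] \tag{7.51}$$

It follows that the best estimate of $\Delta = M_2 - M_1$ is given by

$$\hat\Delta = \hat M_2 - \hat M_1 = \frac{1}{1 - \rho\mu}\left[\mu(1 - \rho)(\bar y'' - \bar x'') + \lambda(\bar y' - \bar x')\right] \tag{7.52}$$

Since $\bar y'' - \bar x''$ is an unbiased estimate of Δ, it follows from the Corollary
to Theorem 1.10 that

$$V(\hat\Delta) = \mathrm{Cov}(\bar y'' - \bar x'', \hat\Delta) = \frac{2\sigma^2}{n}\,\frac{1 - \rho}{1 - \rho\mu} \tag{7.53}$$

If ρ is positive, the best value of μ which makes $V(\hat\Delta)$ a minimum is
obviously zero. This points to complete matching of the samples on the
two occasions for making estimates of change.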

It is of interest to examine how good an estimate of change can be
made by using simple averages on both occasions. In that case the
estimate is simply

$$\Delta' = \lambda\bar y' + \mu\bar y'' - (\lambda\bar x' + \mu\bar x'') = \lambda(\bar y' - \bar x') + \mu(\bar y'' - \bar x'')$$

and its variance is given by

$$\lambda^2\left(\frac{\sigma^2}{n\lambda} + \frac{\sigma^2}{n\lambda} - 2\rho\frac{\sigma^2}{n\lambda}\right) + \mu^2\left(\frac{\sigma^2}{n\mu} + \frac{\sigma^2}{n\mu}\right) = \frac{2\sigma^2}{n}[\lambda(1 - \rho) + \mu] = 2(1 - \lambda\rho)\frac{\sigma^2}{n}$$

and the ratio of $V(\hat\Delta)$ to $V(\Delta')$ is given by

$$\frac{1 - \rho}{(1 - \rho) + \lambda\mu\rho^2}$$

Table 7.2 gives the relative gain in precision, namely $\lambda\mu\rho^2/(1 - \rho)$, in
percentage terms for some combinations of λ and ρ. It will be found
from the table that substantial gains in precision can be achieved by
using the better estimator when ρ is high.


Table 7.2 Percent gain in precision of $\hat\Delta$ over $\Delta'$ as
estimators of change

                  λ
  ρ       1/2    1/3    1/5

  0.5      12     11      8
  0.6      22     20     14
  0.7      41     36     26
  0.8      80     71     51
  0.9     202    180    130
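
The gain of $\hat\Delta$ over $\Delta'$ can be recomputed from the two variances, (7.53) and $2(1 - \lambda\rho)\sigma^2/n$. The sketch below uses hypothetical function names and works in units of $\sigma^2/n$.

```python
def var_change_best(rho, lam):
    """V(Delta_hat) in units of sigma^2/n, from (7.53), with mu = 1 - lam."""
    mu = 1.0 - lam
    return 2.0 * (1.0 - rho) / (1.0 - rho * mu)

def var_change_simple(rho, lam):
    """V(Delta') in units of sigma^2/n for the simple-average estimator."""
    return 2.0 * (1.0 - lam * rho)

def gain_change_percent(rho, lam):
    """Percent gain in precision of the best estimator over the simple one."""
    return 100.0 * (var_change_simple(rho, lam) / var_change_best(rho, lam) - 1.0)
```

The result equals $100\lambda\mu\rho^2/(1 - \rho)$; for ρ = 0.9 and λ = 1/2 it is 202.5 percent.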


Further reading See Exercise 61 for making the best estimates of the
current mean and of change when the estimate for the first occasion can
not be revised (as when it has already been published).

7.9.4 ESTIMATION OF SUM ON TWO OCCASIONS

The best estimate of the sum $\Sigma = M_1 + M_2$ is the sum of the MV
estimates of $M_1$ and $M_2$, namely

$$\hat\Sigma = \hat M_1 + \hat M_2 = \frac{1}{1 + \rho\mu}\left[\mu(1 + \rho)(\bar y'' + \bar x'') + \lambda(\bar y' + \bar x')\right] \tag{7.54}$$

And

$$V(\hat\Sigma) = \mathrm{Cov}(\bar y'' + \bar x'', \hat\Sigma) = \frac{2\sigma^2(1 + \rho)}{n(1 + \rho\mu)} \tag{7.55}$$

If ρ is positive (which will ordinarily hold) the best replacement policy
for estimating the sum on two occasions is to have μ = 1 or λ = 0, which
means taking an independent sample at the second occasion. In case
the sum is estimated by taking the simple average on each occasion, the
estimator would be

$$\Sigma' = \lambda(\bar y' + \bar x') + \mu(\bar y'' + \bar x'')$$

with a variance of $2(1 + \lambda\rho)(\sigma^2/n)$, and the relative gain in precision
achieved by using $\hat\Sigma$ in place of $\Sigma'$ would be given by

$$\frac{\lambda\mu\rho^2}{1 + \rho}$$

Table 7.3 gives the % relative gain in precision for different combinations
of ρ and λ. It will be found that the gain is not substantial even for
high values of ρ.

Table 7.3 Percent gain in precision of $\hat\Sigma$ over $\Sigma'$ as
estimators of sum

                  λ
  ρ       1/2    1/3    1/4

  0.5     4.2    3.7    3.1
  0.6     5.6    5.0    4.2
  0.7     7.2    6.4    5.3
  0.8     8.8    7.8    6.6
  0.9    10.6    9.5    7.8
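
The same comparison for the sum can be sketched from (7.55) and the variance $2(1 + \lambda\rho)\sigma^2/n$ of the simple estimator. Function names are hypothetical and variances are in units of $\sigma^2/n$.

```python
def var_sum_best(rho, lam):
    """V(Sigma_hat) in units of sigma^2/n, from (7.55), with mu = 1 - lam."""
    mu = 1.0 - lam
    return 2.0 * (1.0 + rho) / (1.0 + rho * mu)

def var_sum_simple(rho, lam):
    """V(Sigma') in units of sigma^2/n for the simple-average estimator."""
    return 2.0 * (1.0 + lam * rho)

def gain_sum_percent(rho, lam):
    """Percent gain in precision of the best estimator of the sum."""
    return 100.0 * (var_sum_simple(rho, lam) / var_sum_best(rho, lam) - 1.0)
```

The gain equals $100\lambda\mu\rho^2/(1 + \rho)$, which stays small even for high ρ because of the factor $1 + \rho$ in the denominator.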


Further reading For estimating the best values of the sample size on the 

first occasion and the subsampling fraction at the second occasion see 
Exercise 62. 


7.10 REGRESSION ESTIMATION IN REPETITIVE SURVEYS

In Sec. 7.9.1 use was made of the difference estimator $\hat M_{2m}$ for forming
an estimate of $M_2$ based on the matched part of the sample. Cochran
(1963) and Jessen (1942) have used the regression estimator

$$\hat M_{2m} = \bar y' + b(\bar x - \bar x') = \bar y' + \mu b(\bar x'' - \bar x') \qquad \bar x = \lambda\bar x' + \mu\bar x'' \tag{7.56}$$

in place of the difference estimator. An approximate expression for
$V(\hat M_{2m})$ can be obtained by substituting ρ for b in (7.56). In that case

$$V(\hat M_{2m}) = \frac{\sigma^2}{n\lambda} + \rho^2\mu^2\left(\frac{1}{n\lambda} + \frac{1}{n\mu}\right)\sigma^2 - 2\mu\rho^2\frac{\sigma^2}{n\lambda} = \frac{\sigma^2}{n\lambda}(1 - \mu\rho^2)$$

By weighting $\hat M_{2m}$ and $\hat M_{2u} = \bar y''$ inversely to their variances, the estima-
tor obtained is

$$\hat M_2 = \frac{1}{1 - \rho^2\mu^2}\left[\lambda\mu\rho(\bar x'' - \bar x') + \lambda\bar y' + \mu(1 - \rho^2\mu)\bar y''\right] \tag{7.57}$$

This is the same as the minimum-variance estimator (7.45) provided b is
taken equal to ρ. Its variance, calculated under the assumption that
b = ρ, is given by (7.46).


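7.11 SAMPLING ON MORE THAN TWO OCCASIONS

The theory can be extended to more than two occasions, assuming that the
sample size n and the correlation ρ between consecutive occasions are the
same throughout. A more general notation will now have to be used. On
occasion h let $\lambda_h$ and $\mu_h$ be the fractions matched and unmatched with
occasion h - 1, let $\bar y'_h$ and $\bar y''_h$ be the means of the matched and the
unmatched units on occasion h, and let $\bar x'_{h-1}$ be the mean on occasion
h - 1 of the matched units. The estimator on occasion h is sought as a
linear function of $\hat M_{h-1} - \bar x'_{h-1}$, $\bar y'_h$, and $\bar y''_h$, where $\hat M_{h-1}$ is the
minimum-variance estimator on occasion h - 1. Using the conditions that
$\hat M_h$ is uncorrelated with the zero functions, and noting that
$\mathrm{Cov}(\bar y'_h, \hat M_{h-1}) = \rho\,\mathrm{Cov}(\bar x'_{h-1}, \hat M_{h-1})$, we get the coefficient of
$\hat M_{h-1} - \bar x'_{h-1}$ as $\rho(1 - \phi_h)$,
so that the minimum-variance estimator is of the form

$$\hat M_h = \phi_h\bar y''_h + (1 - \phi_h)\left[\bar y'_h + \rho(\hat M_{h-1} - \bar x'_{h-1})\right] \tag{7.58}$$

In the language of Sec. 7.10 this means that the estimator of $M_h$ is a
linear function of two independent estimators $\bar y''_h$ and $\bar y'_h + \rho(\hat M_{h-1} - \bar x'_{h-1})$.
The quantity $\phi_h$ will be determined from the condition that

$$\mathrm{Cov}(\bar y''_h, \hat M_h) = \mathrm{Cov}(\bar y'_h, \hat M_h)$$

which gives

$$\phi_h\frac{\sigma^2}{n\mu_h} = (1 - \phi_h)\left[\frac{\sigma^2}{n\lambda_h} + \rho\,\mathrm{Cov}(\bar y'_h, \hat M_{h-1}) - \rho^2\frac{\sigma^2}{n\lambda_h}\right]$$

But $\mathrm{Cov}(\bar y'_h, \hat M_{h-1}) = \rho\,\mathrm{Cov}(\bar x'_{h-1}, \hat M_{h-1}) = \rho V(\hat M_{h-1})$

This gives

$$\phi_h = \frac{\rho^2V(\hat M_{h-1}) + \dfrac{\sigma^2}{n\lambda_h}(1 - \rho^2)}{\rho^2V(\hat M_{h-1}) + \dfrac{\sigma^2}{n\lambda_h}(1 - \rho^2) + \dfrac{\sigma^2}{n\mu_h}} \tag{7.59}$$

Further

$$V(\hat M_h) = \mathrm{Cov}(\bar y''_h, \hat M_h) = \phi_h\frac{\sigma^2}{n\mu_h} \tag{7.60}$$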


The question of the limiting value of $V(\hat M_h)$ when the optimum values
of $\lambda_h$ and $\mu_h$ are used has been considered by Cochran (1963). Let

$$V(\hat M_h) = \frac{\sigma^2}{n}G_h$$

From (7.59) and (7.60) we have

$$G_1 = 1 \qquad \frac{1}{G_h} = \mu_h + \frac{1}{\rho^2G_{h-1} + (1 - \rho^2)/\lambda_h} \tag{7.61}$$

The variance of $\hat M_h$ will be a minimum when the quantity on the right-
hand side of (7.61) is a maximum. Simple calculus methods give the
best value of $\lambda_h$ as

$$\lambda_h = \frac{\sqrt{1 - \rho^2}}{G_{h-1}(1 + \sqrt{1 - \rho^2})}$$

Substituting this value of $\lambda_h$ in Eq. (7.61), we get

$$\frac{1}{G_h} = 1 + \frac{(1 - \sqrt{1 - \rho^2})^2}{\rho^2G_{h-1}} \tag{7.62}$$

Thus the limiting value of $G_h$ is

$$\lim_{h \to \infty} G_h = G = \frac{2}{\rho^2}\left[\sqrt{1 - \rho^2} - (1 - \rho^2)\right] \tag{7.63}$$

and hence

$$\lim_{h \to \infty} \lambda_h = \frac{1}{2} = \lim_{h \to \infty} \mu_h \tag{7.64}$$
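
The approach to the limits (7.63) and (7.64) can be followed by iterating (7.62) together with the optimum $\lambda_h$. This is only a sketch with hypothetical function names; it assumes 0 < ρ < 1, since (7.62) divides by $\rho^2$.

```python
import math

def g_sequence(rho, occasions):
    """Iterate G_h under the optimum matching fraction on each occasion.

    Returns a list of (h, lambda_h, G_h); lambda_1 is None because no
    matching is possible on the first occasion. Assumes 0 < rho < 1.
    """
    s = math.sqrt(1.0 - rho ** 2)
    g = 1.0                                   # G_1 = 1
    rows = [(1, None, g)]
    for h in range(2, occasions + 1):
        lam = s / (g * (1.0 + s))             # best lambda_h given G_{h-1}
        g = 1.0 / (1.0 + (1.0 - s) ** 2 / (rho ** 2 * g))   # Eq. (7.62)
        rows.append((h, lam, g))
    return rows

def g_limit(rho):
    """Limiting value of G_h from (7.63)."""
    s = math.sqrt(1.0 - rho ** 2)
    return 2.0 * (s - (1.0 - rho ** 2)) / rho ** 2
```

For ρ = 0.8 the sequence starts at $G_1 = 1$, drops to $G_2 = 0.8$ with $\lambda_2 = 0.375$ (the two-occasion result of Sec. 7.9.2), and settles near 0.75 with $\lambda_h$ near 1/2.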


7.12 A USEFUL PROCEDURE 


In some monthly surveys it may be possible to collect data from the
respondents for the current month as well as for the previous month (for
example, retail-trade sales of shops). And it may be considered desirable
not to rotate the same units month after month over the year but spread
the burden of response evenly over the population. The following sample
design may then be adopted. Every month an independent random
sample of establishments is taken. During the enumeration each member
of the sample provides data both for the current month and the previous
month (Eckler, 1955; Woodruff, 1963).

Let $\bar y_h$ and $\bar x_{h-1}$ denote the sample means for the hth and (h - 1)st
occasions, respectively, based on the sample of n units taken on occasion h.
We shall assume that $V(\bar y_h) = \sigma^2/n$ on each occasion and that ρ is the
correlation coefficient between consecutive periods. The best linear
estimator $\hat M_h$ of the mean on occasion h will be of the form

$$\hat M_h = \bar y_h - a_h\bar x_{h-1} + a_h\hat M_{h-1} \qquad a_1 = 0$$

and

$$\hat M_{h-1} = \bar y_{h-1} - a_{h-1}\bar x_{h-2} + a_{h-1}\hat M_{h-2}$$

The condition for $\hat M_h$ to be minimum variance is

$$\mathrm{Cov}(\bar x_{h-1}, \hat M_h) = \mathrm{Cov}(\bar y_{h-1}, \hat M_h)$$

since both $\bar x_{h-1}$ and $\bar y_{h-1}$ refer to the same occasion.

But

$$\mathrm{Cov}(\bar x_{h-1}, \hat M_h) = (\rho - a_h)\frac{\sigma^2}{n}$$

And

$$\mathrm{Cov}(\bar y_{h-1}, \hat M_h) = a_h\,\mathrm{Cov}(\bar y_{h-1}, \hat M_{h-1}) = a_h(1 - \rho a_{h-1})\frac{\sigma^2}{n}$$

Hence

$$a_h(1 - \rho a_{h-1}) = \rho - a_h$$

which gives

$$a_h = \frac{\rho}{2 - \rho a_{h-1}}$$

The variance of $\hat M_h$ is given by

$$V(\hat M_h) = \mathrm{Cov}(\bar y_h, \hat M_h) = (1 - \rho a_h)\frac{\sigma^2}{n}$$
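
The recursion $a_h = \rho/(2 - \rho a_{h-1})$ with $a_1 = 0$, and the associated variance $(1 - \rho a_h)\sigma^2/n$, can be tabulated in a few lines. This is a sketch only; the function name is hypothetical.

```python
def composite_weights(rho, occasions):
    """Tabulate a_h and V(M_h) / (sigma^2 / n) for the design of Sec. 7.12.

    Returns a list of (h, a_h, variance_factor) for h = 1, ..., occasions,
    starting from a_1 = 0.
    """
    a = 0.0
    rows = []
    for h in range(1, occasions + 1):
        if h > 1:
            a = rho / (2.0 - rho * a)      # a_h from a_{h-1}
        rows.append((h, a, 1.0 - rho * a))
    return rows
```

For ρ = 0.8 the weights run 0, 0.4, 0.476, ... and approach the fixed point of the recursion, $a = (1 - \sqrt{1 - \rho^2})/\rho = 0.5$, at which the variance factor is $\sqrt{1 - \rho^2} = 0.6$.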


Further reading 

1. In Sec. 7.12 the sample numbers are the same on each occasion. If 
the sample sizes vary, the procedure indicated in Exercise 64 may be 
followed. 

2. If the estimate for a month is made after data for the succeeding 
month have been collected, it is shown in Exercise 65 that this is equiva¬ 
lent to the procedures of Sec. 7.11 with 50 percent overlap. 

3. If the sample contains unexpectedly large units, their inclusion as
such will inflate the sampling error. In Exercise 67 is discussed a method 
of reducing the impact of large units on the sampling variance. 

REFERENCES 

Cochran, W. G. (1963). "Sampling Techniques," 2d ed. John Wiley & Sons,
Inc., New York.

Eckler, A. R. (1955). Rotation sampling. Ann. Math. Stat., 26.

Jessen, R. J. (1942). Statistical investigation of a sample survey for obtaining
farm facts. Iowa Agr. Expt. Sta. Res. Bull., 304.

Lahiri, D. B. (1951). A method of sample selection providing unbiased ratio
estimates. Intern. Stat. Inst. Bull., 33.

Patterson, H. D. (1950). Sampling on successive occasions with partial replace-
ment of units. J. Roy. Stat. Soc., B 12.

Raj, D. (1954). Ratio estimation in sampling with equal and unequal proba-
bilities. J. Ind. Soc. Agr. Stat., 6.

Raj, D. (1964). On double sampling for pps estimation. Ann. Math. Stat., 35.

Raj, D. (1965a). On a method of using multiauxiliary information in sample
surveys. J. Am. Stat. Assoc., 60.

Raj, D. (1965b). On sampling over two occasions with probability proportionate
to size. Ann. Math. Stat., 36.

Rao, C. R. (1952). Some theorems on minimum-variance unbiased estima-
tion. Sankhya, 12.

Woodruff, R. S. (1959). The use of rotating samples in the Census Bureau's
monthly surveys. Proc. Soc. Stat. Sec., Am. Stat. Assoc.

Yates, F. (1949). "Sampling Methods for Censuses and Surveys." Charles
Griffin & Company, Ltd., London.






CHAPTER EIGHT

NONSAMPLING ERRORS


8.1 INTRODUCTION 


In the theory presented in the foregoing chapters it has been assumed
that to each unit $U_i$ in the population is attached a value $y_i$ called the
true value of the unit for the character y. It has also been assumed that
whenever $U_i$ is in the sample the value of y reported or observed on it is
$y_i$. It is important to take a good look at these assumptions. With
some characters, such as the age of a person, or number [...],
the idea of a true value is not difficult to contemplate. In other
situations, such as the attitude of a person on [...], the con-
cept of the true value is harder to arrive at, and it may even be senseless
to contemplate it. However, the assumption that the value reported or
observed on unit $U_i$ is always $y_i$, irrespective of who reports it and under
what circumstances it is obtained, is an [...].
Actual survey experience does not support it. To give an
example, the author (Raj, 1965) made an investigation in Greece in which
all the parcels of land situated in five communes were [...] by Agricul-
tural Officials and the name of the operator recorded. Later on the
farmers in the communes were asked how many parcels they operated.
[...] the number of parcels operated as found
by the ground survey. There is no dearth of examples to show that
errors of observation, or errors of response, are [...]
sampling errors alone. In
this chapter we shall face the problem of the presence of response errors
and devise methods for the measurement of these errors to the point that
it is possible to determine what proportion of the total budget should be
devoted to the control of response errors and what should be devoted to
the control of sampling errors. In addition to response errors surveys
are subject to errors of coverage, processing errors, etc. A study will be
made of some of these errors in the latter part of this chapter.


8.2 RESPONSE ERRORS 

We shall assume that there is a true value attached to the unit $U_j$ in the
population. An interviewer assigned to collect information plays the role
of a person who is trying to shoot at a target. We all know that if the
target is shot at a very large number of times, the shots, or rather the
deviations, form a distribution with a mean and variance. Another
marksman will generally produce another distribu-
tion, with a different mean and variance. We are going to carry this
analogy over to the realm of collection of information. When an inter-
viewer [...] a unit for collection of information, the response
obtained is an observation on a random
variable [...]. Different interviewers will produce
[...], depending upon their skill, the interaction between
the interviewer and the respondent, and so on. When it comes to inter-
viewing [...] by the same person, experience shows that
[...] can not be assumed to be uncorrelated. The
interviewer [...] the observations he produces. The fact
that he has [...] observation on [...] a bit
[...] observation on the other unit. The author noted during the
post-census check of the 1958 [...] Establishments that investigators who
did not find many [...] during the early part of their
job [...] produced a very high rate of [...] as compared with that
of others who had started differently.
Responses observed by the same interviewer appeared to be correlated.
We are thus going to recognize the presence of correlations within inter-
viewer assignments. But it will be carrying matters too far (thereby
making analysis of data too complicated) if we assume correlations
between the response obtained by one interviewer on one unit and that of
another interviewer on another unit. A uniform system of training of
the interviewers and other procedures may bring about such correlations,
but we are going to ignore them. Another point to remember is that the
distribution of responses produced by an interviewer is going to depend
on what may be called the essential conditions of the survey. In a
thorough survey with considerable resources, in which great attention is
paid to the problems of training, interviewing, etc., the distribution will
be different from one in which all that is considered important is writing
out a questionnaire and ordering some persons to fetch data at a moment's
notice. Thus while speaking of the random variables involved, we shall
always have in mind the essential conditions of the survey which deter-
mine these distributions.

8.3 RESPONSE BIAS 

The reason that the interviewer is being brought into the picture for the
study of response errors is that modern large-scale surveys are usually
conducted with the help of interviewers specially trained for the purpose
in order to get worthwhile results. We shall assume that a large number
M of interviewers is available for the survey (Hansen, et al., 1951). The
response $x_{ijk}$ obtained by interviewer i on unit j is a random variable
(this is the heart of the assumption), possessing a distribution with
$E_2(x_{ijk}) = X_{ij}$ and $V_2(x_{ijk}) = S_{ij}^2$. The average of responses obtained by
interviewer i on all the N units in the population would be $\sum_j X_{ij}/N = \bar X_i$,
and the average obtained by all the M interviewers available for the survey
would be $\sum_i \bar X_i/M = \bar X$. This may be called the expected survey value,
the true value being $\sum_j Y_j/N = \bar Y$. Our target is to estimate $\bar Y$ and hence
the difference, $\bar X - \bar Y$, between the expected survey value and the true
value, is called the response bias. The response bias would obviously
depend upon interviewing procedures, the questionnaire, and the training
of personnel. Unless proper procedures can be devised which would
guarantee a small response bias, it would not be worthwhile to go ahead
with the survey.



8.4 THE ANALYSIS OF DATA

Since the response is going to depend on who interviews whom, there
should be proper randomization procedures for the allocation of the
sample interviewers (selected out of the M available) to the sample units
(selected out of the N units in the population). We shall be discussing
the theoretically most simple situation, in which a random sample of
$\bar n = n/m$ units is selected from the population of N units and assigned to
an interviewer selected at random from the M available for the survey.
Another independent random sample of $\bar n$ units is selected and assigned
to another interviewer selected at random from the M. In all m such sub-
samples, each of size $\bar n$, are selected and assigned to the m interviewers.

Calling this sampling scheme scheme B, we shall prove the following
theorem.

Theorem 8.1

Under scheme B, an unbiased estimator of $\bar X$ is provided by

$$\bar x = \frac{\sum_i\sum_j x_{ijk}}{\bar nm} \tag{8.1}$$

with

$$V(\bar x) = \frac{V(x)}{n} + \frac{\bar n - 1}{n}C(x,I) \tag{8.2}$$

where

$$V(x) = \frac{\sum_i\sum_j E(x_{ijk} - \bar X)^2}{MN} \tag{8.3}$$

$$C(x,I) = \frac{\sum_i\sum_j\sum_{j' \ne j} E(x_{ijk} - \bar X)(x_{ij'k'} - \bar X)}{MN(N - 1)} \tag{8.4}$$


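Proof If a unit is selected at random from the population of N units
and an interviewer is picked at random from the M and assigned to
the selected unit, the expected value of the response will be $\bar X$. This is
so because for a given interviewer i and a given unit $U_j$, $E_2(x_{ijk}) = X_{ij}$.
Hence for fixed i, $E(x_{ijk}) = \sum_j X_{ij}/N = \bar X_i$, and therefore
$E(x_{ijk}) = \sum_i \bar X_i/M = \bar X$. This proves that the sample mean
$\bar x_i = \sum_j x_{ijk}/\bar n$ provided by the ith
selection (of the interviewer) gives an unbiased estimate of $\bar X$. Hence
$E(\bar x) = \bar X$. The bias in $\bar x$ for estimating $\bar Y$ will be
$\bar X - \bar Y$. In order to find the variance of $\bar x$ we shall make repeated
use of Theorem 1.8. It is to be remembered that the probability that a
pair of units is assigned to an interviewer is $\bar n(\bar n - 1)/N(N - 1)$ and
the chance that a specified interviewer is selected at the ith selection is
1/M. We shall begin by finding $V(\bar x_i)$ by making use of Theorem 1.8;
that is,

$$V(\bar x_i) = E_1V_2(\bar x_i) + V_1E_2(\bar x_i)$$

where the conditional expectations are taken keeping the interviewer and
the sample units assigned to him fixed. We have

$$E_2(\bar x_i) = \frac{\sum_j X_{ij}}{\bar n}$$

$$V_2(\bar x_i) = \frac{\sum_j S_{ij}^2 + \sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'})}{\bar n^2}$$

$$E_1V_2(\bar x_i) = \frac{1}{\bar nMN}\sum_i\sum_j S_{ij}^2 + \frac{\bar n - 1}{\bar nMN(N - 1)}\sum_i\sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'}) \tag{8.5}$$

Now, $V_1E_2(\bar x_i) = V(h_i) = E_1V_2(h_i) + V_1E_2(h_i)$, $h_i = E_2(\bar x_i)$,
where the conditional expectations are taken assuming the interviewer
fixed.

We have $E_2(h_i) = \bar X_i$

$$V_2(h_i) = \frac{1}{\bar nN}\sum_j (X_{ij} - \bar X_i)^2 + \frac{\bar n - 1}{\bar nN(N - 1)}\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i)$$

$$E_1V_2(h_i) = \frac{1}{\bar nMN}\sum_i\sum_j (X_{ij} - \bar X_i)^2 + \frac{\bar n - 1}{M\bar nN(N - 1)}\sum_i\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i) \tag{8.6}$$

$$V_1E_2(h_i) = \frac{1}{M}\sum_i (\bar X_i - \bar X)^2 \tag{8.7}$$

Adding the terms given by (8.5), (8.6), and (8.7), we get $V(\bar x_i)$. Now
$V(\bar x) = V(\bar x_i)/m$, since the random variables $\bar x_i$ (i = 1, ..., m) are
uncorrelated. Hence we get

$$V(\bar x) = \frac{1}{mM}\sum_i (\bar X_i - \bar X)^2 + \frac{1}{nMN}\sum_i\sum_j (X_{ij} - \bar X_i)^2$$

$$+ \frac{\bar n - 1}{nMN(N - 1)}\sum_i\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i)$$

$$+ \frac{1}{nMN}\sum_i\sum_j S_{ij}^2 + \frac{\bar n - 1}{nMN(N - 1)}\sum_i\sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'}) \tag{8.8}$$

We shall now define the variance of $x_{ijk}$ over the population of all inter-
viewers and all units as

$$V(x) = \frac{1}{MN}\sum_i\sum_j E(x_{ijk} - \bar X)^2 = \frac{\sum_i\sum_j S_{ij}^2}{NM} + \frac{\sum_i\sum_j (X_{ij} - \bar X_i)^2}{NM} + \frac{\sum_i (\bar X_i - \bar X)^2}{M} \tag{8.9}$$

The covariance between responses obtained from different
individuals by the same interviewer is

$$C(x,I) = \frac{\sum_i\sum_j\sum_{j' \ne j} E(x_{ijk} - \bar X)(x_{ij'k'} - \bar X)}{MN(N - 1)}$$

$$= \frac{\sum_i\sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'})}{MN(N - 1)} + \frac{\sum_i\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i)}{MN(N - 1)} + \frac{\sum_i (\bar X_i - \bar X)^2}{M} \tag{8.10}$$

Using (8.9) and (8.10) we find that

$$V(\bar x) = \frac{V(x)}{n} + \frac{\bar n - 1}{n}C(x,I) = \frac{V(x)}{n} + \left(\frac{1}{m} - \frac{1}{n}\right)C(x,I)$$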

Corollary

Defining by $\rho = C(x,I)/V(x)$ the intra-interviewer correlation coefficient
of responses, the variance can be written as

$$V(\bar x) = \frac{V(x)}{n}[1 + (\bar n - 1)\rho]$$

Corollary

$MSE(\bar x) = V(\bar x) + (\bar X - \bar Y)^2$. Thus the mean square error of $\bar x$ around
the true mean $\bar Y$ is the sum of the variance of $\bar x$ and the square of the
response bias.

Remark The quantity V(x) is the variance over all responses of
all units to all interviewers. The quantity C(x,I) is the covariance
between responses obtained from different units by the same interviewer.
With some characters, that is, in cases in which the interviewer is required
to make an estimate, such as eye-estimation of area under a crop in a
field, the covariance C(x,I) may be substantial and may form an important
part of $V(\bar x)$. On the other hand, with factual items, such as the number
of persons in the household, their distribution by age and sex, etc., the
interviewer may have little to do with the response obtained from the
members of the household and in this case C(x,I) may be assumed to be an
unimportant component of $V(\bar x)$.

Remark If survey procedures are such that the response bias is quite
large relative to $V(\bar x)$, the variance of the sample mean will give a mis-
leading picture of the accuracy attained by the survey. It is the total
error of the estimate, measured by $MSE(\bar x)$, which is to be made small and
not simply the variance of $\bar x$.

Remark If $\bar n = 1$, that is, if each interviewer enumerates just one unit,
the covariance term drops from the expression for $V(\bar x)$.

Remark Scheme B of this section is due to Mahalanobis (1946) who
calls it the method of interpenetrating subsamples. The m subsamples
are interpenetrating in the sense that each is a probability sample over
the population.

Further reading See Exercise 71 for an extension to the situation in
which a random sample taken from the entire population is allocated to
strata.


8.5 THE OPTIMUM NUMBER OF INTERVIEWERS

From Theorem 8.1, the variance of the sample mean based on a survey
employing m interviewers is made up of two components. One component,
V(x), is the variability of all responses over all units to all interviewers.
The other one, C(x,I), is the covariance between responses obtained from
different units within interviewer assignments (called interviewer covari-
ance). If advance estimates of V(x) and C(x,I) are available which may
be assumed to be usable over the range of m (the number of interviewers)
envisaged, it is possible to determine from (8.2) the optimum number of
interviewers to employ for the collection of data. Assuming a simple cost
structure, let $c_1$ be cost per unit in the sample and $c_2$ be the cost per inter-
viewer, so that the total cost of the survey is given as $C = c_0 + c_1n + c_2m$.
By the method of undetermined multipliers, the values of n and m can be
found which minimize $V(\bar x)$ for a given cost. The two equations obtained
by equating to zero the derivatives of $V(\bar x) + \lambda(c_0 + c_1n + c_2m - C)$
with respect to n and m are:

$$\lambda c_1 = \frac{V(x) - C(x,I)}{n^2} \qquad \lambda c_2 = \frac{C(x,I)}{m^2}$$

Hence

$$\frac{m}{n} = \left(\frac{c_1}{c_2}\right)^{1/2}\left[\frac{C(x,I)}{V(x) - C(x,I)}\right]^{1/2} \tag{8.11}$$

The actual values of m and n can be obtained by substitution in the cost
function. Since C(x,I) and V(x) would ordinarily themselves depend on
the number of interviewers used and the size of interviewer assignment,
the results obtained should merely be used for getting an idea of the
magnitudes involved. Used wisely, this can point to a rational manner in
which resources are directed toward the reduction of sampling (as judged
by V(x)) and nonsampling errors (interviewer errors in this case).
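
Given advance values of V(x) and C(x,I), Eq. (8.11) and the cost function determine n and m. The sketch below is illustrative: function and argument names are hypothetical, n and m are left as real numbers (they would be rounded in practice), and 0 < C(x,I) < V(x) is assumed.

```python
import math

def optimum_interviewers(V_x, C_xI, c0, c1, c2, budget):
    """Return (n, m) satisfying (8.11) and the cost C = c0 + c1*n + c2*m."""
    if not 0.0 < C_xI < V_x:
        raise ValueError("this sketch assumes 0 < C(x,I) < V(x)")
    ratio = math.sqrt((c1 / c2) * C_xI / (V_x - C_xI))   # m/n from (8.11)
    n = (budget - c0) / (c1 + c2 * ratio)                # substitute in cost
    m = ratio * n
    return n, m

def variance_of_mean(V_x, C_xI, n, m):
    """V(xbar) from (8.2)."""
    return V_x / n + (1.0 / m - 1.0 / n) * C_xI
```

For instance, with V(x) = 100, C(x,I) = 1, $c_1$ = 1, $c_2$ = 100 and a budget of 10,000 (with $c_0$ = 0), the sketch gives roughly 4,990 units spread over about 50 interviewers; moving interviewers up or down at the same total cost increases the variance.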

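By the use of the method of interpenetrating subsamples (Sec. 8.4)
it is possible to get from the sample itself an estimate of
$V(\bar x)$, C(x,I), and V(x). These estimates are made in Theorem 8.2.

Theorem 8.2

Under scheme B, unbiased estimates of C(x,I), V(x), and $V(\bar x)$ are provided by

$$\hat C(x,I) = \frac{s_b^2 - s_w^2}{\bar n} \tag{8.12}$$

$$\hat V(x) = s_w^2 + \frac{s_b^2 - s_w^2}{\bar n} \tag{8.13}$$

$$\hat V(\bar x) = \frac{s_b^2}{n} \tag{8.14}$$

where

$$s_b^2 = \frac{\bar n\sum_i (\bar x_i - \bar x)^2}{m - 1} \qquad s_w^2 = \frac{\sum_i\sum_j (x_{ijk} - \bar x_i)^2}{m(\bar n - 1)}$$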
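
The estimates of Theorem 8.2 are simple to compute from the m interviewer assignments. The following sketch is illustrative only: the function name and the returned keys are hypothetical, and every interviewer is assumed to have the same number of units (at least two).

```python
def interpenetrating_estimates(samples):
    """Estimate C(x,I), V(x) and V(xbar) from m subsamples of equal size.

    samples : list of m lists, one per interviewer, each holding the
              nbar responses obtained by that interviewer (nbar >= 2).
    """
    m = len(samples)
    nbar = len(samples[0])
    n = m * nbar
    means = [sum(s) / nbar for s in samples]          # xbar_i
    grand = sum(means) / m                            # xbar
    sb2 = nbar * sum((xb - grand) ** 2 for xb in means) / (m - 1)
    sw2 = sum((x - xb) ** 2
              for s, xb in zip(samples, means) for x in s) / (m * (nbar - 1))
    c_hat = (sb2 - sw2) / nbar                        # (8.12)
    return {
        "C_xI": c_hat,
        "V_x": sw2 + c_hat,                           # (8.13)
        "V_mean": sb2 / n,                            # (8.14)
    }
```

As a small check, two interviewers reporting [1, 3] and [5, 7] give $s_b^2$ = 16 and $s_w^2$ = 2, so the estimates are 7 for C(x,I), 9 for V(x), and 4 for $V(\bar x)$.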


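Proof As $\bar x = \sum_i \bar x_i/m$ and since $\bar x_i$ are independently and identi-
cally distributed random variables, we prove by Theorem 3.4 that Eq.
(8.14) holds. At the same time it follows that

$$E(s_b^2) = \bar nmV(\bar x) = nV(\bar x) = V(x) - C(x,I) + \bar nC(x,I)$$

Now

$$\sum_i\sum_j (x_{ijk} - \bar x_i)^2 = \sum_i\sum_j x_{ijk}^2 - \frac{1}{\bar n}\sum_i\Big(\sum_j x_{ijk}\Big)^2$$

$$E\sum_i\sum_j x_{ijk}^2 = E\sum_i\sum_j (X_{ij}^2 + S_{ij}^2) = \frac{n}{MN}\sum_i\sum_j (X_{ij}^2 + S_{ij}^2)$$

Given the ith interviewer, $E(\sum_j x_{ijk})^2 = (E\sum_j x_{ijk})^2 + V(\sum_j x_{ijk})$

But $(E\sum_j x_{ijk})^2 = (\bar n\bar X_i)^2 = \bar n^2\bar X_i^2$

$$V\Big(\sum_j x_{ijk}\Big) = E_1V_2\Big(\sum_j x_{ijk}\Big) + V_1E_2\Big(\sum_j x_{ijk}\Big)$$

$$E_2\Big(\sum_j x_{ijk}\Big) = \sum_j X_{ij}$$

$$V_1E_2\Big(\sum_j x_{ijk}\Big) = \frac{\bar n\sum_j (X_{ij} - \bar X_i)^2}{N} + \frac{\bar n(\bar n - 1)}{N(N - 1)}\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i)$$

$$V_2\Big(\sum_j x_{ijk}\Big) = \sum_j S_{ij}^2 + \sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'})$$

$$E_1V_2\Big(\sum_j x_{ijk}\Big) = \frac{\bar n}{N}\sum_j S_{ij}^2 + \frac{\bar n(\bar n - 1)}{N(N - 1)}\sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'})$$

Hence, given the ith interviewer

$$E\Big(\sum_j x_{ijk}\Big)^2 = \bar n^2\bar X_i^2 + \frac{\bar n\sum_j (X_{ij} - \bar X_i)^2}{N} + \frac{\bar n(\bar n - 1)}{N(N - 1)}\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i)$$

$$+ \frac{\bar n}{N}\sum_j S_{ij}^2 + \frac{\bar n(\bar n - 1)}{N(N - 1)}\sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'})$$

Thus

$$E\sum_i\Big(\sum_j x_{ijk}\Big)^2 = \frac{m}{M}\sum_i E\Big[\Big(\sum_j x_{ijk}\Big)^2 \,\Big|\, i\Big]$$

Hence

$$E\sum_i\sum_j (x_{ijk} - \bar x_i)^2 = \frac{n}{MN}\sum_i\sum_j (X_{ij}^2 + S_{ij}^2) - \frac{m}{\bar nM}\sum_i E\Big[\Big(\sum_j x_{ijk}\Big)^2 \,\Big|\, i\Big]$$

$$= m(\bar n - 1)\Big[\frac{1}{MN}\sum_i\sum_j (X_{ij} - \bar X_i)^2 + \frac{1}{MN}\sum_i\sum_j S_{ij}^2$$

$$- \frac{1}{MN(N - 1)}\sum_i\sum_j\sum_{j' \ne j} (X_{ij} - \bar X_i)(X_{ij'} - \bar X_i) - \frac{1}{MN(N - 1)}\sum_i\sum_j\sum_{j' \ne j} C_2(x_{ijk}, x_{ij'k'})\Big]$$

$$= m(\bar n - 1)[V(x) - C(x,I)]$$

Thus

$$E(s_w^2) = V(x) - C(x,I)$$

Also

$$E(s_b^2) = V(x) - C(x,I) + \bar nC(x,I)$$

Hence

$$C(x,I) = E\left(\frac{s_b^2 - s_w^2}{\bar n}\right)$$

and

$$V(x) = E\left(s_w^2 + \frac{s_b^2 - s_w^2}{\bar n}\right)$$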

*c«e r * B - • ' » “ S " S) 

^tigating^J°®Pleted a 81m , 

‘Mm Urvey atKl °dcte mple ° f Units ' 0yinE m interviewers, e 

For t number of?** the optbCm “ **“■* “d C( 

6 0084 function °f ^ United"* T“ ber <^) of interview 

^ 8 5 > uu es^ b ! USed in a future s 
* (ci/cjjtir/ a ‘ e ° f m «/-o is given by 

- S „*) /S]V4 


Wlf 

n 0 


ft '«•(> 

v m ° r * It i 

V % bfe fact tj 

° d “ of i n C t a “ ^ mad e ^“ubiaaed estimate of the 

Remark T . 1Qlate s 0 f • grating s , m t ^ xe presence of res 
UQ bi a8ed { each erv iew er Car 8 ^ mples is indeed si 

"<^Rr “ ■“»■— 

' V0U)ti be 0 ?, atl °u on just one un 
i KXtii ~ £)7n(n - 1). 





This coincides with the estimator of Theorem 3.3, used in wr simple random sampling. Thus the usual variance estimator is all right provided an interviewer investigates only one unit. But the estimator of the mean, of Sec. 3.4, would be a biased one in the presence of response errors.


Remark In practice an interviewer collects information from several units. In this situation the estimator $\sum (x_{ijk} - \bar{x})^2/n(n - 1)$ would be inappropriate. The proper variance estimator is given by (8.14).


8.7 SOME RESTRICTED MODELS

The theory presented in the earlier sections is quite general, the main assumption made being that the response obtained by interviewer $i$ on unit $j$ is a random variable. We shall now discuss a particular case of this model. To start with, a check will be made of Formula (8.2) for the situation in which $x_{ijk} = y_j$ (the true value of unit $j$). We have

Xij = Vi Sij 2 = 0 Xi = X = Y E(x) = Y 
V(x) = (N — 1) C(x,I) = TT~ CzixijkjXij'k') = 0 


N 


Hence 




This agrees with the sampling variance of $\bar{y} = (1/m)\sum \bar{y}_i$ in which $m$ independent samples each of size $n$ are selected at random, and in which a unique value is attached to every unit. If the responses are subject to a constant bias so that $x_{ijk} = y_j + \alpha$, $V(\bar{x})$ remains unchanged but $\bar{x}$ is no longer unbiased.

Suppose next that the response obtained by interviewer $i$ on unit $j$ is $x_{ijk} = y_j + \alpha_i + e_{ijk}$, where $E_2(e_{ijk}) = 0$ and $\alpha_i$ is the bias associated with interviewer $i$. The random variables $e_{ijk}$ are assumed to be uncorrelated. For this restricted model, we have

1.-* + * * * = f + 4 &i ’ = OV 


SJ 


v(z) = (i - i) s,* + (i - M ) 
c(z,n = (i - ~h s 


+ S. : 


E(x) = Y + a 


i 




Hence 






These results are illuminating. The sample mean does not give an unbiased estimate of $\bar{Y}$ unless the individual biases $\alpha_i$ of the interviewers average to zero over the population of $M$ interviewers. Secondly, $V(\bar{x})$ is built up of three components. One component is the sampling variance obtained when there are no response errors whatsoever. The other factor (which is to be added) is a function of $S_\alpha^2$, the variability of the biases of the interviewers. Also, there is the variance of individual response deviations. This points to the need that survey procedures be so designed as to ensure that $\bar{\alpha}$ is about zero and $S_\alpha^2$ is as small as possible.


Further reading The optimum number of enumerators for a certain cost function is obtained in Exercise 70 when the additive model is assumed.


8.8 UNCORRELATED RESPONSE ERRORS 


In certain types of surveys there may be evidence that errors of response can be assumed to be uncorrelated from one unit to another in the sample. For example, this may happen in surveys carried out by mail in which respondents living far and wide are required to fill out certain questionnaires. We will study in this situation the effect of response errors on the sampling procedures discussed in the previous chapters. A wtr simple random sample of $n$ units is taken. The model assumed is that the response on the $j$th unit in the sample may be exhibited as


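$$x_{jk} = y_j + e_{jk} \qquad (8.15)$$

where $E_2(e_{jk}) = 0$ and $V_2(e_{jk}) = S_{ej}^2$. The $e_{jk}$'s are assumed to be uncorrelated. Denoting the sample mean by $\bar{x} = \sum x_{jk}/n$, we have

$$E_2(\bar{x}) = \bar{y} \qquad V_2(\bar{x}) = \frac{1}{n^2}\sum_{j=1}^{n} S_{ej}^2$$

Hence

$$V_1E_2(\bar{x}) = (1 - f)\frac{S_y^2}{n} \qquad E_1V_2(\bar{x}) = \frac{1}{n}\frac{\sum_{j=1}^{N} S_{ej}^2}{N}$$

$$E(\bar{x}) = \bar{Y} \qquad V(\bar{x}) = (1 - f)\frac{S_y^2}{n} + \frac{1}{n}\frac{\sum S_{ej}^2}{N} \qquad (8.16)$$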

This shows that, provided the response errors average to zero, the sample mean is an unbiased estimate of the population mean. But its variance is not simply the customary sampling variance of the mean (Theorem 3.1). It gets inflated by a quantity depending upon the variances $S_{ej}^2$ of individual response errors. In order to estimate $V(\bar{x})$ from the sample, suppose we use the customary variance estimator $(1 - f)\sum (x_{jk} - \bar{x})^2/n(n - 1)$. We have

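$$E\sum_j (x_{jk} - \bar{x})^2 = E_1\sum_j (S_{ej}^2 + y_j^2) - nE(\bar{x}^2)$$

$$= \frac{n}{N}\sum_{j=1}^{N} (y_j^2 + S_{ej}^2) - n[\bar{Y}^2 + V(\bar{x})]$$

$$= (n - 1)\left[S_y^2 + \frac{\sum S_{ej}^2}{N}\right]$$

Hence

$$E\left[(1 - f)\frac{\sum (x_{jk} - \bar{x})^2}{n(n - 1)}\right] = V(\bar{x}) - \frac{\sum S_{ej}^2}{N^2} \qquad (8.17)$$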


This shows that the customary variance estimator slightly understates 
the true variance. 
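The size of this understatement can be checked on a small numerical case. The following is only a sketch under stated assumptions: the population values `y` and the response variances `s2_e` are hypothetical, and the function names are introduced here for illustration. It evaluates $V(\bar{x}) = (1 - f)S_y^2/n + \sum S_{ej}^2/nN$ and the expected value of the customary estimator, and the difference should come out as $\sum S_{ej}^2/N^2$.

```python
from fractions import Fraction


def _population_s2(y):
    """S_y^2 with divisor N - 1, computed exactly."""
    N = len(y)
    y_bar = Fraction(sum(y), N)
    return sum((Fraction(v) - y_bar) ** 2 for v in y) / (N - 1)


def variance_of_mean(y, s2_e, n):
    """V(x_bar) for a wtr simple random sample of n units when each
    response carries an uncorrelated error with variance s2_e[j]."""
    N = len(y)
    f = Fraction(n, N)
    return (1 - f) * _population_s2(y) / n + Fraction(sum(s2_e), N) / n


def expected_customary_estimator(y, s2_e, n):
    """Expected value of (1 - f) * sum (x - x_bar)^2 / (n (n - 1))."""
    N = len(y)
    f = Fraction(n, N)
    # E[s^2] = S_y^2 + (average response variance) under this model
    return (1 - f) * (_population_s2(y) + Fraction(sum(s2_e), N)) / n


# Hypothetical population of N = 5 units
y = [3, 5, 8, 10, 14]
s2_e = [1, 2, 1, 4, 2]
n = 2
understatement = variance_of_mean(y, s2_e, n) - expected_customary_estimator(y, s2_e, n)
```

The understatement is of order $1/N^2$, which is why it is slight whenever the population is large.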

The model (8.15) can be extended by assuming a systematic bias of $\alpha_j$ associated with the $j$th unit. Then

$$x_{jk} = y_j + \alpha_j + e_{jk} = y_j' + e_{jk}$$

where the $e_{jk}$'s are assumed to be uncorrelated. The new formulas are

$$E(\bar{x}) = \bar{Y} + \bar{\alpha} \qquad V(\bar{x}) = (1 - f)\frac{S_{y'}^2}{n} + \frac{1}{n}\frac{\sum S_{ej}^2}{N} \qquad (8.18)$$

Thus the sample mean is subject to a response bias of $\bar{\alpha}$. The new variance formula differs from (8.16) in that the variance of $y_j'$ is involved in place of the variance of $y_j$, the true values. Regarding the estimator of the
variance, formula (8.17) continues to hold. It must be pointed out here 
that actual survey experience in certain fields shows that biases are 
exceptionally stable. As a result, when surveys are regularly conducted 
over the same population, it should be possible to make estimates of 
change (Sec. 7.9) with very small response bias provided the essential 
conditions of the survey are about the same at the two periods (see also 
Exercise 77). 


8.9 ESTIMATION OF RESPONSE BIAS

As stated in Sec. 8.4 the response bias, $\bar{X} - \bar{Y}$, may be an important component of the mean square error of the sample estimate. If this component is large, it is not worthwhile trying to reduce the other components. The problem is how to make an estimate of the response bias. Since it involves $\bar{Y}$, the mean of the true values, the response bias cannot be measured as such from the survey. If an estimate of $\bar{Y}$ is available from another source which is believed to be very accurate, the response bias can then be estimated. But often it is difficult to make sure that the other source is very accurate. Another approach in recent years has been to conduct a small-scale study after the main survey in which better interviewers are used and the procedures employed are more careful and detailed. The differences of the estimates based on the main survey and the small-scale survey throw some light on the response biases involved. At least one can find what difference more careful procedures will make in the results. This method is being increasingly used to estimate the response bias of census data. The small-scale studies are called post-enumeration surveys. Occasionally it is possible to match the survey data with figures believed to be true values of the units involved. Such comparisons throw light on the size and direction of response errors. The present author (Raj, 1965) made such a comparison between farmers' reports on the number of parcels operated along with their areas and the corresponding data gathered through a ground survey. The so-called true average size of holding was found to be 8.7 parcels, while the reports produced a figure of 6.6 parcels, the response bias being -2.1 parcels. In case of areas, the averages were 42.2 and 37.8, giving a response bias of -4.4 (measured in stremmas). The responses were found to be correlated with the true values, the correlation coefficient being -0.39 for the number of parcels and -0.23 for areas. In surveys of this type it is better to concentrate efforts toward the reduction of response bias (by improving survey procedures) rather than increase the size of the sample. Another study of this kind has been reported by Kish and Lansing (1954), the item considered being the value of homes (see also Exercise 75).

mated bias^Tbta^fwh^thl 7 Wh<ir6 the etandard error of 4)16 
for estimating a population proportion^ “ reP6ated ” ^ “““ 


8.10 EXTENSION TO OTHER SAMPLING DESIGNS

The analysis of response errors has so far been restricted to the case in which a simple random sample has been selected from the entire population. The method can be easily extended to other sampling designs. Suppose stratification has been employed. We assume that a large number $M_h$ of interviewers is available for work in the $h$th stratum from which $m_h$ are selected at random. These interviewers are allocated at random to $m_h$ psu's, one to each, selected with replacement with pp to $z$. From the $j$th psu in stratum $h$ a sample of $m_{hj}$ second-stage units (say households) is selected at random from the $M_{hj}$ in the psu. Denoting by $x_{hijku}$ the response obtained by interviewer $i$ on the $k$th household in the $j$th psu, the population total $Y = \sum Y_h$ would be estimated by

$$\hat{X} = \sum_h \frac{1}{m_h}\sum_i \frac{1}{p_{hj}}\frac{M_{hj}}{m_{hj}}\sum_k x_{hijku}$$

We shall assume that given the stratum $h$, the interviewer $i$, the psu $j$ and the household $k$,

$$E_2(x_{hijku}) = X_{hijk} \quad \text{and} \quad V_2(x_{hijku}) = S_{hijk}^2$$


and that responses taken by the same interviewer on two different households are not necessarily uncorrelated. But we shall assume that the random variables $x_{hijku}$ and $x_{hi'j'k'u'}$ are uncorrelated. In stratum $h$ there will be $m_h$ interpenetrating subsamples, each providing an unbiased estimate of $X_h = \sum_i \sum_j \sum_k X_{hijk}/M_h$, the expected survey value for stratum $h$, the true value being $Y_h = \sum_i \sum_j \sum_k y_{hjk}/M_h$. The estimator of $X_h$ based on the $i$th interviewer would be

$$\hat{X}_{hi} = \frac{1}{p_{hj}}\frac{M_{hj}}{m_{hj}}\sum_k x_{hijku}$$

And an unbiased estimator of $V(\hat{X})$ would be given by

$$\sum_h \frac{1}{m_h(m_h - 1)}\sum_i (\hat{X}_{hi} - \hat{X}_h)^2 \qquad \hat{X}_h = \frac{1}{m_h}\sum_i \hat{X}_{hi}$$


8.11 RESPONSE AND SAMPLING VARIANCE

We shall now make a more detailed examination of the mean square error of a survey estimate when response errors are present. An attempt will be made to separate the response variance from the sampling variance (Hansen et al., 1964). Suppose a unit is selected at random from a population of $N$ units in order to estimate the population mean $\bar{Y}$. Let the response on the selected unit $U_j$ on trial $t$ be $x_{jt}$. Given this unit, let $E(x_{jt} \mid j) = X_j$ and further let $E(X_j) = \sum X_j/N = \bar{X}$. We shall then define:

Response deviation: $d_{jt} = x_{jt} - X_j$

Sampling deviation: $\delta_j = X_j - \bar{X}$

It is easy to see that $E(x_{jt}) = \bar{X}$, so that $B = \bar{X} - \bar{Y}$ is the amount of bias present in $x_{jt}$ for estimating $\bar{Y}$. For the mean square error of $x_{jt}$ we have

$$\text{MSE}(x_{jt}) = E(x_{jt} - \bar{Y})^2 = E(d_{jt} + \delta_j + B)^2$$

$$= E(d_{jt}^2) + E(\delta_j^2) + 2E(d_{jt}\delta_j) + B^2$$

$$= V(d_{jt}) + V(\delta_j) + 2\,\text{Cov}(d_{jt}, \delta_j) + B^2 \qquad (8.19)$$

The first term in Eq. (8.19) will be called the response variance; the second, the sampling variance; and the third, the covariance between response and sampling deviations.
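The decomposition of the mean square error of a single response into response variance, sampling variance, covariance and squared bias can be verified by exact enumeration. The sketch below is illustrative only: the three-unit population and its response distributions are hypothetical, chosen so that every expectation can be computed with exact fractions.

```python
from fractions import Fraction as F

# Hypothetical population: true value y_j and a discrete response
# distribution {response value: probability} for each unit.
units = [
    {"y": 2, "resp": {1: F(1, 2), 3: F(1, 2)}},
    {"y": 4, "resp": {4: F(1, 4), 6: F(3, 4)}},
    {"y": 9, "resp": {8: F(1, 1)}},
]
N = len(units)
Y_bar = F(sum(u["y"] for u in units), N)
# X_j = expected response of unit j over repeated trials
X = [sum(x * p for x, p in u["resp"].items()) for u in units]
X_bar = sum(X) / N
B = X_bar - Y_bar  # response bias

mse = resp_var = samp_var = cov = F(0)
for u, Xj in zip(units, X):
    delta = Xj - X_bar                   # sampling deviation
    samp_var += delta ** 2 / N
    for x, p in u["resp"].items():
        d = x - Xj                       # response deviation
        w = p / N                        # joint probability of (unit, response)
        mse += w * (x - Y_bar) ** 2
        resp_var += w * d ** 2
        cov += w * d * delta

decomposition = resp_var + samp_var + 2 * cov + B ** 2
```

In this one-unit setting the covariance term comes out as zero, because the response deviation has zero expectation for every given unit; the term can matter for the sample mean, where deviations on different units enter.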




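Theorem 8.3

Let a simple random sample of $n$ units be selected with replacement from a population, giving rise to responses at trial $t$ as

$$x_{1t}, x_{2t}, \ldots, x_{nt} \qquad \bar{x} = \frac{\sum x_{jt}}{n}$$

Then $E(\bar{x}) = \bar{X}$, $B(\bar{x}) = \bar{X} - \bar{Y}$

and the mean square error of $\bar{x}$ around $\bar{Y}$ is

$$\text{MSE}(\bar{x}) = \frac{1}{n}V(d_{jt})[1 + (n - 1)\rho(d_{jt}, d_{kt})] + \frac{V(\delta_j)}{n} + 2\,\text{Cov}\left(\frac{\sum d_{jt}}{n}, \frac{\sum \delta_j}{n}\right) + (\bar{X} - \bar{Y})^2$$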


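proof Using the expression for MSE from Eq. (8.19), we have

$$\text{MSE}(\bar{x}) = V\left(\frac{\sum d_{jt}}{n}\right) + V\left(\frac{\sum \delta_j}{n}\right) + 2\,\text{Cov}\left(\frac{\sum d_{jt}}{n}, \frac{\sum \delta_j}{n}\right) + (\bar{X} - \bar{Y})^2 \qquad (8.20)$$

Furthermore

$$V\left(\frac{\sum d_{jt}}{n}\right) = E\left(\frac{\sum d_{jt}}{n}\right)^2 = E\frac{\sum d_{jt}^2}{n^2} + 2E\frac{\sum_{j<k} d_{jt}d_{kt}}{n^2}$$

$$= \frac{1}{n}V(d_{jt}) + \frac{n - 1}{n}\text{Cov}(d_{jt}, d_{kt})$$

$$= \frac{1}{n}V(d_{jt})[1 + (n - 1)\rho(d_{jt}, d_{kt})] \qquad (8.21)$$

The proof of Theorem 8.3 now follows.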

Remark The quantity $n^{-1}V(d_{jt})$ in Eq. (8.21) is the simple variance of the response deviations and is called the simple response variance. The second term reflects any correlations among the response deviations from one unit to another within the survey trial.

Remark The first term in Eq. (8.20) will be called the response variance; the second, the sampling variance; and the third, the covariance between response and sampling deviations.

Remark The precision of the estimate depends on the sampling variance, the response variance, the covariance between sampling and response deviations, and the bias. In case response deviations are all zero the precision will be governed simply by the sampling variance and the bias.
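To see how strongly even a small within-trial correlation of response deviations can inflate the response variance $n^{-1}V(d_{jt})[1 + (n - 1)\rho]$ of Eq. (8.21), consider the following sketch. The function name and the numerical values are assumptions chosen purely for illustration.

```python
def response_variance_of_mean(v_d, n, rho=0.0):
    """Response variance of the sample mean:
    (1/n) * V(d) * [1 + (n - 1) * rho]."""
    return v_d / n * (1 + (n - 1) * rho)


# Illustrative values: V(d) = 0.04, n = 1000 responses
simple = response_variance_of_mean(0.04, 1000)             # uncorrelated deviations
correlated = response_variance_of_mean(0.04, 1000, 0.01)   # rho = 0.01
inflation = correlated / simple                            # 1 + 999 * 0.01
```

With $n = 1000$ a correlation of only 0.01 multiplies the simple response variance by about eleven, which is why correlated errors (for example those induced by interviewers) deserve attention in large samples.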


Further reading See Exercise 79 for an extension to pps sampling. 


8.11.1 APPLICATION TO ESTIMATING PROPORTIONS 

The results obtained in the previous section will now be applied to the problem of estimating the proportion $P$ of units belonging to the class $A$, when a wr simple random sample of $n$ units is taken. Suppose the response obtained on unit $U_j$ is a random variable which classifies the unit to the class $A$ or not $A$ according as $x_{jt} = 1$ (with probability $P_j$) or zero (with probability $Q_j$).


Then

$$E(x_{jt} \mid j) = P_j \qquad E(P_j) = \frac{\sum P_j}{N} = \bar{P} = E(p)$$

$$V(d_{jt}) = E(d_{jt}^2) = \frac{1}{N}\sum [(1 - P_j)^2 P_j + P_j^2(1 - P_j)] = \frac{\sum P_j(1 - P_j)}{N}$$

$$V(\delta_j) = \frac{\sum (P_j - \bar{P})^2}{N}$$

Hence the simple response variance is $\sum P_j(1 - P_j)/nN$, and the sampling variance is $\sum (P_j - \bar{P})^2/nN$. If we further assume that the within-trial response deviations are uncorrelated and so are the sampling and response deviations,

$$\text{MSE}(p) = \frac{\bar{P}(1 - \bar{P})}{n} + (\bar{P} - P)^2 \qquad (8.22)$$

Remark The sum of the simple response variance and the sampling variance equals $\bar{P}(1 - \bar{P})/n$, which is analogous to the well-known expression $P(1 - P)/n$ taken as the sampling variance of the estimate $p$ (see Exercise 69). This shows that, in this case, the customary expression for the sampling variance includes the simple response variance.
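The statement that the simple response variance and the sampling variance add up to $\bar{P}(1 - \bar{P})/n$ can be confirmed with exact arithmetic. In this sketch the list `P` of unit-level classification probabilities $P_j$ is hypothetical and the function name is introduced only for illustration.

```python
from fractions import Fraction as F


def proportion_components(P, n):
    """Return (P_bar, simple response variance, sampling variance)
    for a wr simple random sample of n units, given the probabilities
    P[j] that unit j is classified into class A on any trial."""
    N = len(P)
    P_bar = sum(P) / N
    simple_response_var = sum(p * (1 - p) for p in P) / (n * N)
    sampling_var = sum((p - P_bar) ** 2 for p in P) / (n * N)
    return P_bar, simple_response_var, sampling_var


# Hypothetical population of five units
P = [F(9, 10), F(4, 5), F(1, 10), F(0), F(1, 2)]
P_bar, srv, sv = proportion_components(P, n=10)
total = srv + sv
```

Units with $P_j$ equal to 0 or 1 are always classified the same way and contribute nothing to the simple response variance; units with $P_j$ near 1/2 contribute most.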

Remark The simple response variance has an upper limit, namely,

$$\frac{1}{n}V(d_{jt}) \le \frac{\bar{P}(1 - \bar{P})}{n} \qquad (8.23)$$

Since a large value of the response variance should indicate greater inconsistency of classification of the units to the class $A$, we may use $V(d_{jt})/\bar{P}(1 - \bar{P})$ as an index of inconsistency of classification.


8.11.2 ESTIMATION OF SIMPLE RESPONSE VARIANCE 

We will now present a method of estimating the simple response variance 
when the survey is repeated, and the purpose is to estimate the population 
proportion P. Let the survey be repeated independently under identical 
conditions using the same sample on both trials t and t'. The following 
four-fold table of frequencies will then be observed (Hansen, et al., 1964). 


                              Original
                       x_jt = 1     x_jt = 0

Repetition  x_jt' = 1      a            b          a + b

            x_jt' = 0      c            d          c + d

                         a + c        b + d          n

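Consider the statistic $g = \sum (x_{jt} - x_{jt'})^2/n$. We have

$$E(g) = E\frac{\sum (x_{jt} - x_{jt'})^2}{n} = V(x_{jt} - x_{jt'}) = V(d_{jt} - d_{jt'}) = 2V(d_{jt})$$

Thus $g/2$ is an unbiased estimate of $V(d_{jt})$. Since $g = (b + c)/n$, an estimate of the simple response variance is given by $(b + c)/2n^2$, and an estimate of the index of inconsistency is provided by $(b + c)/2n$ divided by an estimate of $\bar{P}(1 - \bar{P})$.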
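The calculation from the four-fold table can be sketched as follows. This is an illustration rather than a prescribed procedure: the function name and the cell counts are hypothetical, and the denominator of the index uses the proportion classified in $A$ on the original trial, $(a + c)/n$, which is one assumed choice among several possible estimates of $\bar{P}(1 - \bar{P})$.

```python
def simple_response_variance(a, b, c, d):
    """Estimates from a four-fold table of original (columns) against
    repeated (rows) classifications; b and c are the discordant cells."""
    n = a + b + c + d
    g = (b + c) / n            # mean squared difference between trials
    v_d = g / 2                # estimate of V(d_jt)
    srv = v_d / n              # estimated simple response variance
    p = (a + c) / n            # assumed: proportion in A on the original trial
    index = v_d / (p * (1 - p))
    return g, srv, index


# Hypothetical table with n = 100 units
g, srv, index = simple_response_variance(a=40, b=5, c=7, d=48)
```

Only the discordant cells $b$ and $c$ enter $g$; a table with no discordant units gives an estimated response variance of zero.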


8.12 THE PROBLEM OF NONRESPONSE

Another source of error in large-scale surveys is the nonavailability of information from some of the units in the sample. If a questionnaire is mailed to a sample of establishments, some establishments will fail to respond. If visits are made to a sample of households, some households will be found to be away from home and others may refuse to cooperate. An obvious solution in the first case would be to continue issuing reminders. If this does not remedy the situation, actual visits may be made again and again. Or, a subsample of the nonrespondents may be taken (see Exercise 73) and all efforts directed toward them. A similar procedure can be used in the second case. We continue recalling until it is found that it is no use making further calls. Or, recalls are made on a subsample only. In case the cost per completed schedule based on recalling is far higher than that based on the first call, an ingenious device has been proposed by Hartley (1946) and developed by Politz and Simmons (1949, 1950). It is used when the major cause of noninterviews is the absence of the respondents from home when the interviewer knocks at the door. The interviewer makes only one call at each sample household, the time of call being considered random within interviewing hours. If the household is available, the desired information is collected and it is also asked whether the household was at home on the previous six days at the same time. This information is used to estimate $p$, the probability of availability at home. If the household is found to have been away, no information is collected.

Assume that a wr simple random sample of $n$ households has been selected. Then the $i$th selection can be used to give an estimate of $\bar{Y}$ as

$$\hat{y}_i = \frac{y_i}{\hat{p}_i} \quad \text{if the household is available}$$

$$\hat{y}_i = 0 \quad \text{if the household is not available}$$

where $\hat{p}_i$ is an estimate of $p_i$. Assuming that the at-home probability is estimated from $s$ moments, we note that $\hat{p}_i$ is a random variable taking up the value $j/s$ $(j = 1, 2, \ldots, s)$ with probability $\binom{s-1}{j-1} p_i^{j-1}(1 - p_i)^{s-j}$ if the sample unit is available. Hence, for a specified unit, the expected value of $\hat{y}_i$ would be (given availability)

$$y_i \sum_j (j/s)^{-1}\binom{s-1}{j-1} p_i^{j-1}(1 - p_i)^{s-j} = \frac{y_i}{p_i}(1 - q_i^s) \qquad q_i = 1 - p_i$$

so that

$$E(\hat{y}_i) = \frac{1}{N}\sum y_i(1 - q_i^s)$$

At the same time 


Thus 


where 


Hen 


ce, 


Pi = j Or 1 (®.) vtor* 

E y‘‘ Pi - I - #■>] 





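Finally, let $\hat{\bar{y}} = \sum \hat{y}_i/n$. Then

$$E(\hat{\bar{y}}) = E(\hat{y}_i) \qquad (8.24)$$

and

$$V(\hat{\bar{y}}) = \frac{1}{n}V(\hat{y}_i) \quad \text{estimated by} \quad \frac{\sum (\hat{y}_i - \sum \hat{y}_i/n)^2}{n(n - 1)} \qquad (8.25)$$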
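The weighting by estimated at-home probabilities can be sketched in a few lines. The data layout below is an assumption made for illustration: each sample household is recorded as `None` when not at home, or as a pair `(y, k)` where `k` is the number of the previous $s - 1$ occasions on which it reports having been at home, so that the estimated probability is $(k + 1)/s$. The second function evaluates the sum that gives the expected value of $1/\hat{p}_i$ for an available household.

```python
from math import comb


def at_home_weighted_mean(responses, s):
    """Mean of y / p_hat over all n selections, with p_hat = (k + 1) / s
    for households found at home and a contribution of 0 otherwise."""
    n = len(responses)
    total = 0.0
    for r in responses:
        if r is None:          # household not available: contributes zero
            continue
        y, k = r
        p_hat = (k + 1) / s    # current call plus k earlier occasions at home
        total += y / p_hat
    return total / n


def expected_inverse_p_hat(p, s):
    """E(1 / p_hat | available) when j - 1 is Binomial(s - 1, p)."""
    q = 1 - p
    return sum((s / j) * comb(s - 1, j - 1) * p ** (j - 1) * q ** (s - j)
               for j in range(1, s + 1))


# Hypothetical sample of four households with s = 6 occasions
estimate = at_home_weighted_mean([(10, 5), None, (4, 2), (6, 0)], s=6)
```

The second function reproduces the closed form $(1 - q^s)/p$, which shows that the weighting removes most, though not all, of the bias: households with very small $p$ remain underrepresented by the factor $1 - q^s$.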


The following results were obtained by the author in a survey of a part of Beirut, in which this technique was applied. In addition to this method, further calls were made on the households to obtain complete response. The column "first call adjusted" in Table 8.1 gives the estimates based on this technique.

Table 8.1 Estimates based on recalls

                                Estimates in %, based on

Item                     First call    First call adjusted    All calls

Unmarried                   45.5             59.5                61.7
Literate                    48.1             75.4                61.7
Earner                      24.7             33.5                34.2
Nonearning dependent        52.0             65.9                65.1
Having refrigerator         61.4             63.3                65.9
Having servants             35.5             36.1                42.3


Further reading

1. See Exercise 74 for another procedure in which $s$ calls are made on each person in the sample and information collected from those who are available at least once.

2. Bounds for the bias due to nonresponse when the object is to obtain a population proportion are derived in the exercises.


Whatever the procedures used, there will usually remain some nonrespondents, so it is useful to get some kind of information from those who do not cooperate. A comparison can then be made with the respondents of the survey to assess the probable effect of the exclusion of nonrespondents on the results of the survey. For example, it may be found that the results of the survey relate to 98 percent of the population, two percent belonging to the nonresponse stratum. If it has been possible to collect some information, such as the size of the household, from the nonrespondents, comparative figures on the average size of a household for the two strata would be useful in interpreting the results. The situation becomes more serious when the object is to estimate totals on a regular basis. It is inconvenient to publish differing totals simply because the response rate has changed during the two periods. A practice generally followed is to make substitutions at random from the completed schedules paying due regard to certain known characteristics of the nonrespondents. This will help reduce the bias involved although no method can eliminate the bias altogether. (See Exercise 81.)

Further reading If the respondents are unlikely to cooperate in the survey because the question asked is too personal, it should be possible to win their cooperation by asking them to furnish information on a probability basis only. This is discussed in Exercise 94.


8.13 SOME EXAMPLES OF SOURCES OF ERROR 

For the benefit of readers not connected with dnto 

consumer expenditure by the method of interview It h^ f°" “! 

acts: 

A third source of error is the so-called "conditioning" effect. If one member of a household is interviewed today and another the following day, the intervening conversations might condition the responses on the second interview. The presence of these effects is not peculiar to expenditure data only. To take another example, suppose data are to be collected on food consumption. For this purpose the households selected at random are asked to maintain account books in which records are kept of the food consumed. Many households will refuse to cooperate, and this will affect the character of the sample. If the interview method is used instead, there will be much less nonresponse but many households will not know what they consumed, especially when they happen to be farmers using home-grown foods for which they can give no numerical estimates. If the alternative method of weighing all food consumed is considered, other types of errors come in. As a matter of prestige the household may begin to consume more food than is usual and more expensive food, at least for the first few days. The normal course of life in the household is disturbed. Altogether it may be found to be an exceedingly difficult task to collect usable data on food consumption through household surveys. We shall give another example from the field of agriculture.

The object is to estimate the total production of a crop by harvesting a sample cut (or plot) taken at random from each field (or parcel) selected in the sample. For reasons of economy, it is desired to keep the size of the cut small. But small cuts are found to overestimate the yield. The explanation that is usually offered is the "boundary" effect: there is a tendency for the investigator to include in the cut plants lying on the boundary or perimeter of the sample cut. Naturally this effect becomes less important as the size of the cut is increased. Another explanation given is that in locating the sample cut the investigator may unconsciously tend to favor fertile patches in the field by shifting about the random point to some extent.


REFERENCES

Hansen, M. H., et al. (1951). Response errors in surveys. J. Am. Stat. Assoc.

Hansen, M. H., et al. (1964). The estimation and interpretation of gross differences and the simple response variance. Contributions to Statistics, Calcutta.

Hartley, H. O. (1946). Discussion of paper by F. Yates. J. Roy. Stat. Soc.

Kish, L. and J. B. Lansing (1954). Response errors in estimating the value of homes. J. Am. Stat. Assoc.

Mahalanobis, P. C. (1946). Recent experiments in statistical sampling in the Indian Statistical Institute. J. Roy. Stat. Soc.

Neter, J. and J. Waksberg (1964). A study of response errors in expenditures data from household interviews. J. Am. Stat. Assoc., 59.

Politz, A. N. and W. R. Simmons (1949, 1950). An attempt to get the "not at homes" into the sample without call-backs. J. Am. Stat. Assoc., 44 and 45.

Raj, D. (1965). Farmers reporting at the Census. Sankhya B 27.

Sukhatme, P. V. (1947). The problem of plot size in large-scale yield surveys. J. Am. Stat. Assoc.

Sukhatme, P. V. (1954). Sampling theory of surveys with applications. Ind. Soc. Agr. Stat., New Delhi.





CHAPTER NINE 


OTHER DEVELOPMENTS 


9.1 INTRODUCTION 

A number of important topics were not included in the previous chapters 
at their proper place because they might have retarded the flow of dis¬ 
cussion. They will all be discussed now under the following headings: 
variance estimation, estimation for subpopulations, the best linear esti¬ 
mator, the method of overlapping maps, two-way stratification with small 
samples, the performance of systematic sampling in different situations, 
the method of controlled selection, a general rule for variance estimation 
in multistage sampling, sampling from imperfect frames, and sampling inspection.


9.2 VARIANCE ESTIMATION

In all the sampling systems discussed in this book, estimators of population characteristics (such as means, proportions, etc.), and their variances and variance estimators have always been given. But the question of the stability of the variance estimator was left untouched. This question becomes important if the variance estimated from a sample is to be used for comparing one method with another, if it is to be used for determining the sample size needed to achieve a specified degree of precision, or if it is to be used for making a firm estimate of the precision actually achieved in the survey. We shall agree to judge the stability of a variance estimator $u$ by its coefficient of variation, $\text{CV}(u) = \sigma(u)/E(u)$. We begin by proving the following theorem (Raj, 1958).

Theorem 9.1

Let $t_1, t_2, \ldots, t_k$ be $k$ independently and identically distributed random variables and let $u = \sum (t_i - \sum t_i/k)^2/(k - 1)$. Then

$$\text{CV}^2(u) = \frac{1}{k}\left[\beta_2(t_i) - (k - 3)(k - 1)^{-1}\right] \qquad (9.1)$$

where $\beta_2(t_i) = M_4(t_i)/M_2^2(t_i)$ and $\text{CV}^2(u)$ denotes the square of the coefficient of variation of $u$, $M$ standing for the central moment.


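proof Let $v_i = t_i - E(t_i)$ so that $E(v_i) = 0$. Now

$$u = \frac{\sum (v_i - \bar{v})^2}{k - 1} = \frac{\sum v_i^2}{k} - \frac{2\sum_{i<j} v_i v_j}{k(k - 1)}$$

$$E(u^2) = \frac{1}{k^2}\left[\sum E(v_i^4) + 2\sum_{i<j} E(v_i^2 v_j^2)\right] + \frac{4}{k^2(k - 1)^2}\sum_{i<j} E(v_i^2 v_j^2)$$

since other terms will have zero expectation. Hence,

$$E(u^2) = \frac{1}{k}M_4(t_i) + \frac{1}{k}\left[k - 1 + \frac{2}{k - 1}\right]M_2^2(t_i)$$

or

$$V(u) = \frac{1}{k}M_4(t_i) - \frac{k - 3}{k(k - 1)}M_2^2(t_i)$$

Hence,

$$\text{CV}^2(u) = \frac{1}{k}\left[\beta_2(t_i) - \frac{k - 3}{k - 1}\right]$$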
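The formula $\text{CV}^2(u) = k^{-1}[\beta_2 - (k - 3)/(k - 1)]$ can be checked exactly for a small discrete distribution by enumerating every possible sample of $k$ independent draws. The sketch below assumes a hypothetical three-point distribution `dist`; the function names are introduced here only for illustration, and exact fractions are used so that the two sides can be compared without rounding.

```python
from fractions import Fraction as F
from itertools import product


def beta2(dist):
    """M4 / M2^2 for a discrete distribution {value: probability}."""
    mu = sum(F(v) * p for v, p in dist.items())
    m2 = sum((F(v) - mu) ** 2 * p for v, p in dist.items())
    m4 = sum((F(v) - mu) ** 4 * p for v, p in dist.items())
    return m4 / m2 ** 2


def cv2_formula(b2, k):
    """Squared coefficient of variation of u as given by the theorem."""
    return (b2 - F(k - 3, k - 1)) / k


def cv2_exact(dist, k):
    """Same quantity by enumerating all k-tuples of independent draws."""
    e_u = e_u2 = F(0)
    for combo in product(dist.items(), repeat=k):
        prob = F(1)
        vals = []
        for v, p in combo:
            prob *= p
            vals.append(F(v))
        mean = sum(vals) / k
        u = sum((v - mean) ** 2 for v in vals) / (k - 1)
        e_u += prob * u
        e_u2 += prob * u * u
    return (e_u2 - e_u ** 2) / e_u ** 2


# Hypothetical skewed three-point distribution
dist = {0: F(1, 2), 1: F(1, 3), 4: F(1, 6)}
```

Because the coefficient of variation is unchanged by a constant multiplier, the same result applies whether $u$ is defined with divisor $k - 1$ or $k(k - 1)$.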


Corollary

If a wr simple random sample of size $n$ is selected and the population total is estimated by $\hat{Y} = N\bar{y}$, we have

$$V(\hat{Y}) = \frac{N^2 V(y)}{n} \qquad u = v(\hat{Y}) = \frac{N^2\sum (y_i - \bar{y})^2}{n(n - 1)}$$

Since the $y$'s are independently and identically distributed, Theorem 9.1 gives (Hansen, et al., 1953)

$$\text{CV}^2(u) = \frac{1}{n}\left[\beta_2(y) - \frac{n - 3}{n - 1}\right] \qquad (9.2)$$

Thus the precision of the variance estimator depends, apart from the sample size, on the $\beta_2$ of the distribution of $y$. If the parent population is peaked, giving a high value of $\beta_2$, the variance estimator will be of low precision. Given the value of $\beta_2(y)$ for a population, it is possible to find from (9.2) the sample size $n$ for which the variance estimate would be satisfactory (having a coefficient of variation of, say, 20 percent or less). In case the object is to estimate a population proportion $P$,

$$\beta_2(y) = P^{-1}(1 - P)^{-1} - 3$$

Hence

$$\text{CV}^2(u) = \frac{1}{n}\left[\frac{1}{P(1 - P)} - 4\,\frac{n - 1.5}{n - 1}\right]$$

If $n$ is large so that $(n - 1.5)/(n - 1) = 1$, the sample size is given by

$$n = \frac{1}{\text{CV}^2(u)}\left[\frac{1}{P(1 - P)} - 4\right]$$
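As an illustration of how the sample size follows from the required stability of the variance estimator, the sketch below compares the large-sample rule $n = [1/P(1 - P) - 4]/\text{CV}^2(u)$ with a direct search on $\text{CV}^2(u) = n^{-1}[\beta_2 - (n - 3)/(n - 1)]$. The function names and the values $P = 0.1$ and CV = 20 percent are assumptions chosen for the example.

```python
def cv2_variance_estimator(b2, n):
    """CV^2 of the variance estimator for a wr simple random sample."""
    return (b2 - (n - 3) / (n - 1)) / n


def beta2_bernoulli(P):
    """beta_2 of a 0-1 variable with proportion P."""
    return 1 / (P * (1 - P)) - 3


def approx_sample_size(P, cv):
    """Large-n rule: n = [1 / (P (1 - P)) - 4] / CV^2."""
    return (1 / (P * (1 - P)) - 4) / cv ** 2


def exact_sample_size(P, cv):
    """Smallest n >= 2 whose CV^2 does not exceed the target."""
    n = 2
    while cv2_variance_estimator(beta2_bernoulli(P), n) > cv ** 2:
        n += 1
    return n


approx_n = approx_sample_size(0.1, 0.2)   # a little under 178
exact_n = exact_sample_size(0.1, 0.2)
```

The requirement grows quickly as $P$ moves away from 1/2, since $\beta_2$ of a 0-1 variable becomes large for rare attributes.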


Corollary

If we are dealing with wr pps sampling, the variates $z_i = y_i/p_i$ are independently and identically distributed. Denoting by $u$ the estimator $\sum (z_i - (1/n)\sum z_i)^2/n(n - 1)$, we have

$$\text{CV}^2(u) = \frac{1}{n}\left[\beta_2(y/p) - (n - 3)(n - 1)^{-1}\right] \qquad (9.3)$$

Thus the stability of the variance estimator depends on the fourth moment of $y_i/p_i$. If the ratio of $y$ to $x$ departs considerably from the average for some of the units, we get extreme values of $y/p$ which will inflate the fourth moment. Even a few extreme values may bring about a very high value of $\beta_2$.

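9.2.1 VARIANCE ESTIMATION IN STRATIFIED SAMPLING

Let there be $L$ strata with $n_h$ units selected from stratum $h$ with replacement with pp to $x$. Then, with $z_{hj} = y_{hj}/p_{hj}$,

$$\hat{Y} = \sum_h \frac{1}{n_h}\sum_j z_{hj} \qquad V(\hat{Y}) = \sum_h \frac{1}{n_h}M_2(z_h)$$

An unbiased estimator of $V(\hat{Y})$ is given by

$$u = \sum_h \frac{\sum_j (z_{hj} - \bar{z}_h)^2}{n_h(n_h - 1)} \qquad (9.4)$$

By Theorem 9.1, we have

$$V(u) = \sum_h \frac{1}{n_h^3}M_2^2(z_h)\left[\beta_2(z_h) - \frac{n_h - 3}{n_h - 1}\right]$$

Since $E(u) = \sum (1/n_h)M_2(z_h)$, we have

$$\text{CV}^2(u) = \frac{\sum_h n_h^{-3}M_2^2(z_h)\left[\beta_2(z_h) - (n_h - 3)(n_h - 1)^{-1}\right]}{\left[\sum_h n_h^{-1}M_2(z_h)\right]^2}$$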

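To examine a special case, let there be $L = 100$ strata with $n_h = 2$ and with $M_2(z_h)$ and $\beta_2(z_h)$ the same in each stratum, so that

$$\text{CV}^2(u) = \frac{1}{200}\left[\beta_2(z_h) + 1\right]$$

This gives $\text{CV}^2(u) = 0.014$ when $\beta_2(z_h) = 1.8$. If the distribution is normal, $\beta_2(z_h) = 3$ and $\text{CV}^2(u) = 0.02$.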
There is a simpler method of estimating $V(\hat{Y})$ in stratified sampling when the same number $k$ of units is selected from each stratum. The $i$th selections from the different strata together provide an estimate $t_i$ of the population total, so that we have $k$ independently and identically distributed random variables $t_1, \ldots, t_k$ and $\hat{Y} = \sum t_i/k$. The variance estimator

$$u = v(\hat{Y}) = \frac{\sum (t_i - \hat{Y})^2}{k(k - 1)} \qquad (9.5)$$

is easier to calculate than that given by (9.4), since variances need not be estimated within strata. From Theorem 9.1 its precision is given by $\text{CV}^2(u) = (1/k)[\beta_2(t_i) - (k - 3)/(k - 1)]$. If the number of strata is large, the $t_i$ will be normally distributed. This follows from the central-limit theorem (Sec. 1.12). Hence $\beta_2(t_i) = 3$, which gives $\text{CV}^2(u) = 2/(k - 1)$. Thus the precision of the variance estimator in this case depends entirely on the sample size within one stratum. If there are two units taken per stratum, $\text{CV}^2(u) = 2$ as against 0.014 and 0.02 when the more laborious estimator (9.4) is used. Calling the first selection from each stratum the first interpenetrating subsample, etc., we may state the following result.

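Theorem 9.2

In stratified sampling in which $k$ units are selected with replacement from each stratum, let $t_i$ be the estimate of the population total based on the $i$th interpenetrating subsample and let $\hat{Y} = \sum t_i/k$. Then an unbiased estimator of $V(\hat{Y})$ is

$$u = \frac{\sum (t_i - \hat{Y})^2}{k(k - 1)}$$

and

$$\text{CV}^2(u) = \frac{1}{k}\left[\beta_2(t_i) - \frac{k - 3}{k - 1}\right]$$

If the number of strata is large, $\beta_2(t_i) = 3$ and $\text{CV}^2(u) = 2/(k - 1)$.

Thus in this case the precision of the variance estimator depends solely on the number of degrees of freedom.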
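The contrast between the two variance estimators in stratified sampling can be put in figures. The sketch below assumes, for illustration only, that all strata have the same sample size, the same second moment and the same $\beta_2$, in which case the within-strata estimator has $\text{CV}^2 = [\beta_2 - (n - 3)/(n - 1)]/Ln$; the function names are introduced here.

```python
def cv2_within_strata(L, n, b2):
    """CV^2 of the within-strata variance estimator when the L strata
    share the same sample size n, second moment and beta_2."""
    return (b2 - (n - 3) / (n - 1)) / (L * n)


def cv2_interpenetrating(k, b2_t=3.0):
    """CV^2 of the estimator built from k interpenetrating subsample
    totals; b2_t = 3 corresponds to normally distributed totals."""
    return (b2_t - (k - 3) / (k - 1)) / k


normal_case = cv2_within_strata(L=100, n=2, b2=3.0)   # 0.02
flat_case = cv2_within_strata(L=100, n=2, b2=1.8)     # 0.014
quick_case = cv2_interpenetrating(k=2)                # 2.0
```

With two units per stratum the quick estimator rests on a single degree of freedom, however many strata there are, which explains the hundredfold loss of stability.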

further reading See Exercise 20 for sample allocation to strata in order 
to have a reasonably stable variance estimator. 

9.2.2 THE NUMBER OF DEGREES OF FREEDOM

In estimation theory the stability of a variance estimator is judged by the number of degrees of freedom. This is so because normality is the basic assumption made. When the parent population is not necessarily normal, the stability of the variance estimator is a function of the $\beta_2$ of the distribution [Formula (9.2)]. Some authors have given formulas ascribed to Satterthwaite (1946), by which the number of degrees of freedom can be calculated approximately when the sample design used is stratified simple random sampling. The reasoning appears to take the following form. The population total is estimated by $\hat{Y} = \sum N_i\bar{y}_i$ and the variance estimator $u$ is

$$u = \sum \frac{N_i(N_i - n_i)}{n_i}s_i^2 = \sum g_i s_i^2$$

Assuming normality within strata, it is easy to see that

$$V(u) = 2\sum \frac{g_i^2 S_i^4}{n_i - 1}$$

and

$$\text{CV}^2(u) = \frac{2\sum [g_i^2 S_i^4/(n_i - 1)]}{(\sum g_i S_i^2)^2}$$

We can say that the variance estimator is based on the number of degrees of freedom given by $(\sum g_i S_i^2)^2/\sum [g_i^2 S_i^4/(n_i - 1)]$, just as we can with simple random sampling from a normal population, in which the number of degrees of freedom equals $n - 1$. But this result must be interpreted with great caution. In the absence of normality, the number of degrees of freedom as such may not give the true picture about the precision of the variance estimator.
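The approximate number of degrees of freedom, $(\sum g_i s_i^2)^2/\sum [g_i^2 s_i^4/(n_i - 1)]$ with $g_i = N_i(N_i - n_i)/n_i$, is easy to compute. The following sketch uses sample variances in place of the population values, which is an assumption about how the formula would be applied in practice; the function name and the stratum figures are hypothetical.

```python
def approximate_degrees_of_freedom(N, n, s2):
    """Approximate degrees of freedom of u = sum g_i s_i^2 for stratified
    simple random sampling, with g_i = N_i (N_i - n_i) / n_i.
    N, n, s2 are per-stratum sizes, sample sizes and sample variances."""
    g = [Ni * (Ni - ni) / ni for Ni, ni in zip(N, n)]
    numerator = sum(gi * si for gi, si in zip(g, s2)) ** 2
    denominator = sum((gi * si) ** 2 / (ni - 1) for gi, si, ni in zip(g, s2, n))
    return numerator / denominator


# A single stratum reduces to n - 1 degrees of freedom
one_stratum = approximate_degrees_of_freedom([100], [10], [4.0])

# Four identical strata of 5 units each give 4 * (5 - 1) = 16
equal_strata = approximate_degrees_of_freedom([50] * 4, [5] * 4, [2.0] * 4)

# One dominant stratum pulls the figure down toward its own n_i - 1
unequal = approximate_degrees_of_freedom([1000, 50], [5, 5], [9.0, 1.0])
```

When one stratum dominates $u$, the effective number of degrees of freedom falls well below the total $\sum (n_i - 1)$, and, as noted in the text, even this figure can mislead when the strata are far from normal.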

Consider, for example, the situation in which with-replacement pps sampling is used for selection within strata. Then

$$\text{CV}^2(u) = \frac{\sum_h n_h^{-3}M_2^2(z_h)\left[\beta_2(z_h) - (n_h - 3)(n_h - 1)^{-1}\right]}{\left[\sum_h n_h^{-1}M_2(z_h)\right]^2} \qquad (9.6)$$

Thus the stability of the variance estimator depends upon the $\beta_2$'s within strata, the variances, and the sample sizes $n_h$. There is no simple quantity to calculate which will give us an idea regarding the stability of $u$. But, when the number of strata is large and the variance estimator is based on interpenetrating subsamples, there is an almost exact relationship between the stability of the variance estimator and the number of degrees of freedom involved (Theorem 9.2).

9.2.3 THE METHOD OF RANDOM GROUPS

When a large-scale survey involves the estimation of several characters, the calculation of standard errors is a laborious process. Methods in which standard errors can be calculated more quickly with the attendant risk of some loss of efficiency are being increasingly used. Theorem 9.2 is an illustration of such a method in stratified sampling. We will now consider an unstratified case. Suppose a simple random sample of $n$ units is selected for estimating the population mean. If the customary variance estimator $s^2$ (Sec. 3.3) is used, we have

$$\text{CV}^2(s^2) = \frac{1}{n}[\beta_2(y) - 3] + \frac{2}{n - 1} \qquad (9.7)$$

where n is considered to be very small relative to N so that we may assume 
wr sampling. Instead of calculating s 2 , we may divide the sample into 
jfc mndom groups, each group containing m = n/k units. Denoting by 
k the sample mean based on the ith group, we have M = XU/k, and a 
quicker estimator of variance is provided by 

. y ft - 1 2 

^ k(k - 1) 


Assuming wr sampling, we have, by Theorem 9.1, 


w) - i(«« - |ef) 


Using the fact that U is the mean of m observations on the variable y, a 
simple calculation shows that (Raj, 1964) 



$$\beta_2(\bar{y}_i) = \frac{\beta_2(y) + 3(m - 1)}{m}$$

Hence

$$\mathrm{CV}^2(v) = \frac{1}{n}\,[\beta_2(y) - 3] + \frac{2}{k - 1} \qquad (9.8)$$


A comparison of (9.8) and (9.7) gives 


$$\mathrm{CV}^2(v) - \mathrm{CV}^2(s^2) = 2\left[\frac{1}{k - 1} - \frac{1}{n - 1}\right] = \frac{2(n - k)}{(n - 1)(k - 1)} > 0$$


Denoting by a and a! the coefficients of variation when no grouping is 
made and when k groups are formed, we have for large n: 



* 


Thus with a small number of random groups the precision of the variance 
estimator may be extremely low. 
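The loss of precision from grouping can be tabulated directly from (9.7) and
(9.8). The following is a minimal sketch only: the function names and the
values of $\beta_2$, $n$, and $k$ are assumptions chosen for illustration and
do not come from the text.

```python
def cv2_s2(beta2, n):
    """Squared CV of s^2 under wr sampling, following (9.7)."""
    return (beta2 - 3.0) / n + 2.0 / (n - 1)


def cv2_random_groups(beta2, n, k):
    """Squared CV of the random-group variance estimator, following (9.8)."""
    return (beta2 - 3.0) / n + 2.0 / (k - 1)


if __name__ == "__main__":
    beta2, n = 3.0, 100  # hypothetical kurtosis and sample size
    for k in (2, 5, 10, 25, 100):
        # ratio of the two coefficients of variation (grouped / ungrouped)
        ratio = (cv2_random_groups(beta2, n, k) / cv2_s2(beta2, n)) ** 0.5
        print(k, round(ratio, 2))
```

With $k = n$ every group holds one unit and the two expressions coincide; as
$k$ falls the term $2/(k - 1)$ dominates.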


9.2.4 VARIANCE ESTIMATION IN WTR SAMPLING

It was a simple matter to get usable results on the variance of the variance
estimator when sampling was done with replacement. In case of wtr
sampling it is not difficult to put down formal expressions for the variance
of the estimator of variance, but their simplification for actual use is a
herculean task. Only two sample designs, for which the author has
been able to get simplified results, will be considered. The problem is
important, since it is generally believed that variance estimation is a
hazardous task when sampling is without replacement with unequal
probabilities. Only the case of two sample units within a stratum will be
discussed, but luckily this is a situation of considerable practical importance.
The results for one particular design (Sec. 3.24) are given below.

Theorem 9.3 

Let the first member of the sample be selected with probabilities
$p_i$ $(i = 1, \ldots, N)$ proportionate to measures of size $x_i$ of the
units, the second unit being selected with pp to $x_i$ of the remaining
units. Then, $t_1 = y_1/p_1$ and

$$t_2 = y_1 + (1 - p_1)\,\frac{y_2}{p_2}$$

being unbiased estimators of the stratum total, the variance estimator is

$$u = \frac{(t_1 - t_2)^2}{4}$$

and the coefficient of variation of $u$ is given by

$$1 + \mathrm{CV}^2(u) = \frac{E(u^2)}{[E(u)]^2}$$

where

$$E(u) = V(t)$$

E(u ’ } = m ? (1 ~ pi)1 [* (J - y ) 1 + e - n £ 

- 4(4.. - Y>) Vi + (4.. - Y‘)p] (9- 9) 


Now, 


PROOF 'Thu i 

° y rGSU ^ *° P rov ed is (9.9). We have 
(« 2 ) = E^) - + QEitiHi*) - ±E(hti) + 


E(h A ) = 


= An 


where g, den . " *«[*(«.*)] 

e conditional expectation given the fi r9 ^ 


sel 





Bu t UUii) - = h , I r - Hence 

B(t ,•(,) = rB(ti») = r 2 (ri) = - 

proceeding on the same lines, it is not difficult to establish that 

E(tiHz 2 ) — (A 2 i) 2 — A 42 ~ A a A to + 2YAz\ 

Eihtt*) - A,,2(1 - Pi) 2 y* + 3A 2 i(A 20 - 2 pty*) - An - A 42 + 3FA» 0 
and E(t t A ) = A 43 2p 4 (l - p*) 3 + 4A 82 2pi(l - pt) 2 ^ 

+ 6A 2 iSp<(l — pi)yi 2 — A 42 — A 4 i — A 4 o + 4F2p»p» 3 

After considerable rearrangement of terms we get (9.9). ■ 

Remark In the notation of Theorem 9.3, the expected value of the
square of the variance estimator $u$ in wr pps sampling (corollary, Theorem
9.1) can be written as $[A_{43} + 3(A_{21})^2 - 4YA_{32}]/8$.
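The expression in the remark can be checked by enumerating all ordered pairs
of draws. The sketch below rests on two assumptions that are not spelled out
in this passage: that $A_{rs}$ stands for $\sum y_i^r/p_i^s$, and that for two
wr pps draws the variance estimator is $u = (y_1/p_1 - y_2/p_2)^2/4$. The
function names and the numerical values are hypothetical.

```python
from fractions import Fraction
from itertools import product


def A(y, p, r, s):
    """Assumed notation: A_rs = sum of y_i**r / p_i**s."""
    return sum(Fraction(yi) ** r / pi ** s for yi, pi in zip(y, p))


def expected_u2_formula(y, p):
    """[A_43 + 3 (A_21)^2 - 4 Y A_32] / 8, as quoted in the remark."""
    Y = sum(y)
    return (A(y, p, 4, 3) + 3 * A(y, p, 2, 1) ** 2 - 4 * Y * A(y, p, 3, 2)) / 8


def expected_u2_enumerated(y, p):
    """E(u^2) by direct enumeration of the two wr pps draws."""
    total = Fraction(0)
    for i, j in product(range(len(y)), repeat=2):
        u = (Fraction(y[i]) / p[i] - Fraction(y[j]) / p[j]) ** 2 / 4
        total += p[i] * p[j] * u ** 2
    return total
```

Exact rational arithmetic is used so that the two routes can be compared for
equality rather than to within rounding error.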


9.2.5 VARIANCE ESTIMATION IN RANDOMIZED PPS SYSTEMATIC SAMPLING 

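With this procedure (Sec. 3.18) an unbiased estimator of the stratum
total is $t = y_i/\pi_i + y_j/\pi_j$ with

$$V(t) = \sum{}' (\pi_i\pi_j - \pi_{ij})\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2$$

an unbiased estimator of the variance being

$$v = \frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2 \qquad (9.10)$$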


Hartley and Rao (1962) have obtained an approximate expression for $\pi_{ij}$
when the units in the stratum are randomized before selection. Correct
to $O(N^{-4})$, they find that

T <i - K*i*j{ 1 + Vika + wy) 4- KV 2 + *j 2 + *i*i) 

— HW + %(tt,- + iry)] + H e St* — HSt] 

m 

where $S_k = \sum \pi_t^k$. Substituting this value of $\pi_{ij}$ in (9.10) it can be shown
(Raj, 1965) that to $O(N^3)$

$$E(v^2) = F - 3G \qquad (9.11)$$

whef e F = B n - 2YB n + %(R 2 i) 2 

G = YiBn - B^w + y 2 YStB n - YB n - %St{BtiY + K*iAo 


and 



If 


we use the simpler, but biased, variance estimator 




it can be shown that, correct to 0 (N*), 

$$E(w^2) = F + G$$




(9.13) 


where $F$ and $G$ have been defined before. If the estimator $w$ is used in
wr pps sampling in which $\pi_i = 2p_i$, we have already noted that in this case

$$E(w^2) = F \qquad (9.14)$$


An example will now be given to show what these formulas are expected
to give in some practical situations. The data are taken from the paper by
Horvitz and Thompson (1952), in which the object is to estimate from a
sample of two blocks the total number of households in an area containing
20 blocks. The $x_i$ are eye-estimates of the number of households (Table
9.1) and the $y_i$ are the actual number of households.

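Table 9.1 Actual and eye-estimated number of households

Block   Actual   Eye-estimated      Block   Actual   Eye-estimated
 no.     y_i         x_i             no.     y_i         x_i

  1       19          18             11       18          18
  2        9           9             12       37          40
  3       17          14             13       12          12
  4       14          12             14       47          30
  5       21          24             15       27          27
  6       22          25             16       25          26
  7       27          23             17       25          21
  8       35          24             18       13           9
  9       20          17             19       19          19
 10       15          14             20       12          12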


Table 9 2 o-i 

" WlUCh tl>e I^UttbT'r fW the modified <*■ 

Table , 2 Cn* ^ Ch a^ d ‘0 34 , 

                          Original data                            Modified data
                     ------------------------------------   -------------------------
                     Variance   Variance of   CV of         Variance   Variance of
                     to be      variance      variance      to be      variance
Sampling             estimated  estimator     estimator     estimated  estimator
scheme               (000)      (000)                       (000)      (000)

Simple random         17,122     704,319       1.55          17,122     704,319
wr pps                 3,247      21,259       1.42           9,435     435,129
wtr pps:
  estimator u          3,045      16,748       1.34           8,886     483,437
  estimator v                     15,862       1.32           8,848     354,952
  estimator w                     22,330       1.56           8,848     456,844



With the original data (good measures of size) the stability of the variance
estimator did not deteriorate when one passed on from sampling with equal
probabilities to sampling with unequal probabilities. With poorer measures of
size the variance estimators became less stable in the case of sampling with
unequal probabilities.


9.3 ESTIMATION FOR SUBPOPULATIONS

Ordinarily data are required not only for the entire population but also 
for its subdivisions, which may be called subpopulations, or domains of 
study. For example, in a labor-force survey, estimates may be wanted 
not only for the total number employed, but separately for those working 
in agriculture by sex. Out of all males working in agriculture we may 
wish to know what proportion worked for less than 20 hours during the 
week of the survey, and we may be interested in comparing the average 
earnings of these with those working in industry, and so on. In these 
cases we are estimating totals, means, ratios and proportions in subpopulations.
The question arises: Do we require new theory for this purpose?
The general answer is: No, no new principles are involved. A probability 
sample taken from the entire universe would serve as a probability sample 
taken from the subpopulation provided that units in the sample not 
belonging to the subpopulation are assumed to have a value of zero for 
the character under study. This is like taking a sample from a frame 
which is known to include extra units not belonging to the population 
under consideration. If we make sure that such units are given a value 
of zero, we can make estimates as before. The only difference is that the
number of units in the sample belonging to the subpopulation is a chance
variable.
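The device of giving zero values to units outside the subpopulation can be
sketched as follows for a simple random sample. This is an illustration
under stated assumptions, not the author's procedure: the function name
`domain_estimates`, its arguments, and the figures in the usage note are all
hypothetical.

```python
def domain_estimates(sample, in_domain, N):
    """Estimate a subpopulation total and mean from a simple random sample.

    sample    -- observed y values for the n sampled units
    in_domain -- True/False flags: does the unit belong to the subpopulation?
    N         -- number of units in the whole population
    """
    n = len(sample)
    # Units outside the subpopulation are treated as having y = 0.
    y0 = [y if flag else 0.0 for y, flag in zip(sample, in_domain)]
    total_hat = N * sum(y0) / n          # unbiased for the subpopulation total
    n_g = sum(1 for flag in in_domain if flag)   # a chance variable
    mean_hat = sum(y0) / n_g if n_g else float("nan")
    return total_hat, mean_hat, n_g
```

For example, `domain_estimates([10.0, 20.0, 30.0, 40.0], [True, False, True, False], N=100)`
treats the second and fourth units as zeros when estimating the total, while
the mean is taken over the two units that fall in the subpopulation.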

As an example, let a simple random sample of $n$ persons be
selected from a population containing $N$ persons. If we are interested
in estimating the earnings of males, the females in the population
(and therefore in the sample) will be assumed to have $y_i$ equal to zero.
With this definition, $Y_g = \sum y_i$, where $Y_g$ is the total of the
subpopulation. Hence $\hat{Y}_g = N\sum y_i/n$ is an unbiased estimator of
$Y_g$ and the mean of the subpopulation is estimated by
$\bar{y}_g = \sum y_i/n_g$, where $n_g$ is the number of males in the sample.
Given $n_g$, the expected value of $\bar{y}_g$ is $\bar{Y}_g = Y_g/N_g$,
and hence $E(\bar{y}_g) = \bar{Y}_g$. Regarding the variance of
$\hat{Y}_g$, we immediately have


N 2 


n 





where $S_y'^2$ is defined as $S_y^2$, the only difference being that units
not belonging to the subgroup are assumed to have a value of zero.
Coming now to the variance of $\bar{y}_g$, we find that, given $n_g$,


*•<*•) - r^g.) 


Hence 


\*» Nj • 

r<u-*w- j-V + igja, 


0 \YL / 

As an estimator of its variance we may use $(1 - n/N)s_g^2/n_g$. Strictly
speaking, we should apply the methods of ratio estimation to the estimator
of the mean, since it is the ratio of two random variables. If the
size of the group $N_g$ is known, it is possible to achieve higher precision
by making use of this information. A better estimate of the subpopulation
total $Y_g$ would be $t = N_g\bar{y}_g$. Other estimates could be developed by
methods given in Chap. 5. Similar considerations apply to the estimation of
proportions in the subpopulation.

SUBPOPULATION ANALYSIS FOR OTHER DESIGNS

Consider a population of $L$ strata for which a random sample of $n_h$ units
has been selected from the $N_h$ units of the $h$th stratum
$(h = 1, \ldots, L)$. The total of the subpopulation for a character $y$
will be $\sum_h \sum_i y_{ih}$, where $y_{ih}$ is zero for those units in
the population that do not belong to the subpopulation, and an unbiased
estimate of it will be $\sum N_h\bar{y}_h$. Since this is of the form
$\sum N_h\bar{y}_h$ with $y$ properly defined, the formulas given in
Theorem 4.2 can be used for obtaining its variance or estimating the
variance from the sample. For the subpopulation, the average would be
given by

$$\bar{Y}_g = \frac{\sum_h \sum_i y_{ih}}{\sum_h \sum_i x_{ih}}$$

where $x_{ih}$ takes the value 1 if the unit belongs to the subpopulation and
zero otherwise. A sample estimate of it would be

$$\bar{y}_g = \frac{\sum_h N_h \sum_i y_{ih}/n_h}{\sum_h N_h \sum_i x_{ih}/n_h}$$

This estimator has the same form as the combined ratio estimator of
Sec. 5.18, so that no new formulas are required for its bias and variance.



Consider now a design in which $n$ psu's have been selected
following scheme B of Sec. 6.6, and a random sample of $m_i$ subunits has
been taken from the $M_i$ in the $i$th psu. Defining $x_{ij}$ as 1 if the
$j$th subunit in the $i$th psu belongs to the subpopulation and zero
otherwise, and letting $y_{ij}$ be zero for the subunit if it does not belong
to the subpopulation, we immediately see that the subpopulation mean per
subunit will be estimated by

(l/w) S {Mi/S yn/rrii 


O-/ 71 ) S {Mi/ S Xij/rrii 


The bias, variance, and other particulars of this estimator follow readily 
from the theory developed in Sec. 6.8.2. 


9.4 THE BEST LINEAR ESTIMATOR 

As pointed out in Sec. 2.8, it is possible to consider very general estimators 
in sampling theory. Let there be a sampling scheme in which the units 
are selected with equal or unequal probabilities and sampling is with or 
without replacement. The sampling scheme generates the totality of 
samples $s$, each sample having a probability of $p(s)$, $\sum_s p(s) = 1$.
Let the unit $U_i$ be included in $H_i$ samples so that
$\pi_i = \sum^{H_i} p(s)$ is the probability that $U_i$ occurs in the sample.
Further, let $a_i(s)$ be a coefficient attached to $U_i$ when it occurs in
sample $s$. A fairly general linear estimator may then be written as

$$t = \sum_{i \in s} y_i a_i(s) = \sum_{i=1}^{N} y_i a_i'(s) \qquad (9.15)$$

in which $a_i'(s)$ takes the value 0 if $U_i$ is not in $s$ and the value
$a_i(s)$ if $U_i$ is in $s$. The problem is to determine the coefficients
$a_i(s)$ such that $t$ is unbiased and is uniformly better than any other
unbiased estimator of $Y$ (Godambe, 1955). The expected value of $t$ would be

$$E(t) = \sum_i y_i \sum^{H_i} a_i(s)p(s)$$

In order that $t$ be an unbiased estimator of $Y = \sum y_i$, the
coefficients $a_i(s)$ must satisfy the conditions

$$\sum^{H_i} a_i(s)p(s) = 1 \qquad (i = 1, 2, \ldots, N) \qquad (9.16)$$



We shall now determine the $a_i(s)$ such that $V(t) = \sum t^2p(s) - Y^2$ is a
minimum subject to the conditions given by (9.16). For getting stationary
values, we differentiate with respect to $a_i(s)$ the function

$$V(t) + \sum_i \lambda_i\left[\sum^{H_i} a_i(s)p(s) - 1\right]$$

and equate the resulting expression to zero. We have $2y_it = -\lambda_i$. This
means that the estimator $t$ must be a constant whenever $U_i$ is in the
sample. And this should hold no matter what the $y$'s are. Obviously
enough, it is impossible to choose the coefficients $a_i(s)$ such that the
resulting estimator is a constant whatever the $y$'s be. This proves that
a uniformly best unbiased estimator does not exist. An admissible
estimator, however, is obtained in the following theorem (Roy and
Chakravarti, 1960). By an admissible estimator we mean one for which it
can be shown that there is no estimator which is uniformly better.


Theorem 9.4 

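Under the sampling scheme of Sec. 9.4, let there be the estimator

$$t^* = \sum_{i=1}^{N} y_i a_i^*(s) \qquad (9.17)$$

where $a_i^*(s) = 1/\pi_i$ if $U_i$ is in $s$ and 0 otherwise. Then there does not
exist a $t$ belonging to (9.15) which is uniformly better than $t^*$.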


puoof We have 


Hi 




^ __ V Vi X P(«) VA 


so that t is an unbiased estimator of Y. And 

Now, let there ^ = 

and for which V(t*) — * ~ ^* a <( s )> which is unbiased 

w K W > 0. Since 

+>, • ~ Cov (a(.,a'.) = 2Sw-5 - 

the inequality 22W( 8 * _ , v . * 

5 «ll be positive definite” But ~ ^ mils ^ hold, which means that | 


i?< ~ S " = F(a?) ~ y K) = *(a«) - 

= -[^(a; 2 ) - A;(a* 1 )] = -Eg 






since 


£(o*o') = f 

^ T< 



Hence the matrix ||8# 8»/|| is not positive definite, which leads to the 

result that there is no t which is better than t*. u 


Corollary 

In sampling with replacement with probabilities proportionate to size, an
admissible estimator of $Y$ would be $\sum y_i/[1 - (1 - p_i)^n]$. Note
that this estimator is based on the distinct units in the sample. The
reason that the admissible estimator is not used in large-scale surveys is
the complexity of its calculation along with its standard deviation.
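The estimator in the first corollary can be written out and its unbiasedness
confirmed by enumerating every ordered sample. The sketch below is a
minimal illustration: it reads the corollary as
$\sum y_i/[1 - (1 - p_i)^n]$ taken over the distinct units drawn, and the
function names and test values are hypothetical.

```python
from fractions import Fraction
from itertools import product


def inclusion_prob(p_i, n):
    """Probability that a unit appears at least once in n wr pps draws."""
    return 1 - (1 - p_i) ** n


def distinct_unit_estimate(draws, y, p):
    """Sum of y_i / pi_i over the distinct units in the sample."""
    n = len(draws)
    return sum(Fraction(y[i]) / inclusion_prob(p[i], n) for i in set(draws))


def expectation(y, p, n):
    """Exact expected value of the estimator over all ordered samples of size n."""
    total = Fraction(0)
    for draws in product(range(len(y)), repeat=n):
        prob = Fraction(1)
        for i in draws:
            prob *= p[i]
        total += prob * distinct_unit_estimate(draws, y, p)
    return total
```

Passing the selection probabilities as `Fraction` objects keeps the
enumeration exact, so the expectation can be compared with the population
total directly.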


Corollary 

In wtr sampling with unequal probabilities, the estimator $\sum (y_i/\pi_i)$ of Sec.
3.19 is admissible.

Remark Koop (1963) considers estimators more general than (9.15).
He makes the coefficient $a_i(s)$ to be attached to $y_i$ depend on the order in
which $y_i$ is selected in the sample. We know, however, that the admissible
estimator makes no use of the order of selection.


Further reading

1. We have restricted ourselves to unbiased linear estimators for estimating
population means or totals. It does not, however, imply that,
given any nonlinear unbiased estimator, there exists a linear unbiased
estimator which is uniformly better (see Exercise 97).

2. There does not necessarily exist a linear unbiased estimator for every
general sampling design (see Exercise 91).

3. If there is further information available (for example, the value of the
coefficient of variation of a population), it is possible to improve upon the
admissible estimator of this section (see Exercise 90).

4. For a formal definition of sample design and its relationship to a
sampling scheme, see Exercise 85.


9.5 THE METHOD OF OVERLAPPING MAPS

In multipurpose surveys involving the estimation of several characters,
it is usually found desirable to select the units with one set of probabilities
for estimating one group of characters and with a different set of
probabilities for estimating another group of characters. For example, in the
National Sample Survey of India, population is made the basis for selection
(of psu's) for the household enquiry and area is made the basis of
selection for the land utilization survey. Thus there are two overlapping
maps for the universe (Lahiri, 1954). A problem arising in such a situation
is that of designing a suitable selection procedure such that the
sample units (psu's) for the two types of enquiry are identical or near to
one another. Such a procedure will greatly reduce the cost of operations
in the field. Let there be a stratum containing $N$ psu's. We are required
to select a pair of psu's, one with probabilities $a_1/G, a_2/G, \ldots, a_N/G$
proportional to area and the other with probabilities $b_1/G, b_2/G, \ldots,
b_N/G$ proportional to population. Let $c_{ij}$ be the distance (in some sense)
between the $i$th area psu and the $j$th population psu. Let $x_{ij}/G$ be the
probability with which the corresponding pair of psu's is selected (see
Table 9.3). The problem is to find $x_{ij}$ such that

$$\sum_j x_{ij} = a_i \qquad \sum_i x_{ij} = b_j \qquad \sum_i a_i = \sum_j b_j = G \qquad x_{ij} \ge 0$$

and $Z = \sum\sum c_{ij}x_{ij}$, the expected distance, is minimized. Stated thus, this
is the familiar "transportation problem" (Koopmans, 1951) in linear
programming, which may be solved by the simplex method (Raj, 1956).


Table 9.3 Probability mass to be distributed



The SELECTION PROBABILITIES-' ’ 

The method of overlapping maps has a useful application in the following
problem. From a stratum a psu is selected with pp to measures of size $a_i$.
In due course we get better measures of size of psu's, say $b_i$ (based on
the latest census). The problem is to make use of the new measures of size
for the new survey but to change as few sample psu's (one from each
stratum) as possible. Thus the problem is the same as that discussed in
Sec. 9.5. The object now is to maximize the probability of getting identical
psu's at the two surveys. This is the same as minimizing
$Z = \sum\sum c_{ij}x_{ij}$, where the cost matrix $(c_{ij})$ is given by
Table 9.4.


Table 9.4 Cost matrix in Keyfitz' problem


Psu 

no. 


New survey 
2 3 


N 



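The optimum solution will consist in putting as much mass as possible
in the diagonals, that is, $x_{ii} = \min(a_i, b_i)$. The probability of
getting identical psu's is then given by

$$\frac{\sum \min(a_i, b_i)}{G} = \frac{b_1 + b_2 + b_3 + a_4 + a_5}{G}$$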
G 


It must be pointed out that the probability table (Table 9.3) associated
with this problem cannot be made until after the new sizes
$(b_1, b_2, \ldots)$ are known. At the initial survey a psu is selected with
pp to $a_i$. Suppose $N = 5$, with $a_i > b_i$ for $i \le 3$ and $b_i > a_i$
for $i > 3$. When the new sizes are available, the following procedure may
be used. If the original selection is $U_4$ or $U_5$, it is retained.
Otherwise a chance of rejection is introduced. The chance of rejecting $U_1$
would be $(a_1 - b_1)/a_1$ if $U_1$ is the original selection, that of $U_2$
is $(a_2 - b_2)/a_2$, and so on. If it is determined that $U_1$ (or $U_2$ or
$U_3$) is rejected, the new selection is made between $U_4$ and $U_5$ with
probabilities proportionate to $b_4 - a_4$ and $b_5 - a_5$ respectively. It
is easy to see that with this procedure the probabilities of selection for
the new survey are proportionate to $b_i$ and that the chance of getting
identical psu's is $(b_1 + b_2 + b_3 + a_4 + a_5)/G$.
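The retain-or-reject procedure just described can be followed through
exactly. The sketch below is one reading of it, under the assumption that
both sets of sizes are integers adding to the same total $G$; the function
name `keyfitz_new_probabilities` and the sizes used in the usage note are
hypothetical.

```python
from fractions import Fraction


def keyfitz_new_probabilities(a, b):
    """Return (new selection probabilities, chance of keeping the same psu).

    a -- old integer measures of size, b -- new integer measures of size,
    both assumed to add to the same total G.
    """
    G = sum(a)
    if sum(b) != G:
        raise ValueError("old and new sizes must have the same total")
    # Units whose size has increased share the rejected probability mass.
    surplus = [max(bj - aj, 0) for aj, bj in zip(a, b)]
    total_surplus = sum(surplus)
    new_prob = [Fraction(0)] * len(a)
    same = Fraction(0)
    for i, (ai, bi) in enumerate(zip(a, b)):
        p_init = Fraction(ai, G)
        # A unit whose size has not decreased is always retained;
        # otherwise it is retained with probability b_i / a_i.
        keep = Fraction(1) if bi >= ai else Fraction(bi, ai)
        new_prob[i] += p_init * keep
        same += p_init * keep
        if keep < 1:
            for j, sj in enumerate(surplus):
                new_prob[j] += p_init * (1 - keep) * Fraction(sj, total_surplus)
    return new_prob, same
```

With old sizes `[30, 25, 20, 15, 10]` and new sizes `[20, 20, 15, 25, 20]`
the first three units have shrunk and the last two have grown, which matches
the pattern of the five-unit illustration in the text.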


9.6 TWO-WAY STRATIFICATION WITH SMALL SAMPLES 

In some surveys it may be considered very important to employ two 
criteria of stratification (like altitude and size of locality) which may give 
rise to a large number of substrata. But the total sample size n (the 
number of localities) is not large enough to provide an allocation to each 
cell of the two-way table of substrata. The problem then is how to 
design the sample in this situation and make proper estimates. A particular
case of this problem, when the number of subclasses for each of the
criteria of stratification is the same, was considered in Sec. 4.10. A more

general treatment of the problem will now be presented (Bryant, et al.,
1960). To fix ideas let there be three altitude groups $A_i$ and five size
groups $B_j$, so that there are in all 15 substrata. Let the total number of
sample localities be 10 only. The proportions of the localities in the
altitude groups are $P_{i.}$ $(i = 1, 2, 3)$ while those in the size groups
are $P_{.j}$ $(j = 1, 2, \ldots, 5)$. On the basis of these proportions, it is decided
to have $n_{i.} = nP_{i.}$ sample localities for the $i$th altitude group and
$n_{.j} = nP_{.j}$ localities for the $j$th size group. In this example let
the desired numbers be ..., 3 and 1, 3, 2, 1, 3, respectively. Denoting by
$n_{ij}$ the number of sample localities in cell $(i, j)$, the following
sampling procedure ensures that $E(n_{ij}) = n_{i.}n_{.j}/n$. In order to
select a sample of 10 localities we construct a square with 10 rows and 10
columns and select cells in it at random using the Latin square principle
(Sec. 4.10).

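[Table 9.5: the 10 × 10 selection square with the selected cells marked.
Table 9.6: the sample sizes $n_{ij}$ for altitude groups $A_i$ by size groups
$B_1, \ldots, B_5$. The table bodies are not recoverable.]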

The selected cells are shown as (√) in Table 9.5. Based on the numbers
$n_{i.}$ and $n_{.j}$, lines are drawn parallel to the sides of the square to
determine $n_{ij}$ for each substratum. The $n_{ij}$ so obtained are shown in
Table 9.6. The specified number $n_{ij}$ of localities is selected at random
from the total number $N_{ij}$ in the cell $(i, j)$.

Let $\bar{y}_{ij}$ be the sample mean based on the $n_{ij}$ localities in this
cell. Then as an estimate of the population total $Y$, we use the estimator

$$\hat{Y} = n \sum_i \sum_j \frac{n_{ij}N_{ij}\bar{y}_{ij}}{n_{i.}n_{.j}} \qquad (9.18)$$

Given $n_{ij}$, $E(N_{ij}\bar{y}_{ij}) = Y_{ij}$. But $E(n_{ij}) = n_{i.}n_{.j}/n$. Hence

$$E(\hat{Y}) = \sum_i \sum_j Y_{ij} = Y$$

Thus we have an unbiased estimator of $Y$. To develop the variance of $\hat{Y}$,
we introduce the following lemmas.


Lemma 1 

In Table 9.5 let $u_{rs}$ be a random variable associated with cell $(r, s)$
defined as: $u_{rs} = 1$ if the cell contains a (√) and 0 otherwise. Then
$E(u_{rs}) = 1/n$,

$$V(u_{rs}) = \frac{1}{n} - \frac{1}{n^2} \qquad \mathrm{Cov}(u_{rs}, u_{rs'}) = -\frac{1}{n^2}$$

$$\mathrm{Cov}(u_{rs}, u_{r's}) = -\frac{1}{n^2} \qquad \mathrm{Cov}(u_{rs}, u_{r's'}) = \frac{1}{n^2(n - 1)}$$

Using the fact that the variances and covariances of $n_{ij}$ (Table 9.6) are
the sums of the variances and covariances of $u_{rs}$, we prove the following
lemma.


Lemma 2 

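$$E(n_{ij}) = \frac{n_{i.}n_{.j}}{n}$$

$$V(n_{ij}) = \frac{n_{i.}n_{.j}(n - n_{i.})(n - n_{.j})}{n^2(n - 1)}$$

$$\mathrm{Cov}(n_{ij}, n_{ij'}) = \frac{n_{i.}n_{.j}n_{.j'}(n_{i.} - n)}{n^2(n - 1)}$$

$$\mathrm{Cov}(n_{ij}, n_{i'j}) = \frac{n_{i.}n_{.j}n_{i'.}(n_{.j} - n)}{n^2(n - 1)}$$

$$\mathrm{Cov}(n_{ij}, n_{i'j'}) = \frac{n_{i.}n_{.j}n_{i'.}n_{.j'}}{n^2(n - 1)}$$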
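The mean and variance of $n_{ij}$ can be confirmed on a small square by
listing every possible selection. The sketch below assumes that selection by
the Latin square principle amounts to marking one cell in each row and each
column, that is, choosing a random permutation of the columns; this reading,
the function names, and the group sizes in the check are assumptions made for
illustration.

```python
from fractions import Fraction
from itertools import permutations


def cell_counts(perm, row_sizes, col_sizes):
    """Count marked cells falling in each (row group, column group) block.

    perm[r] is the column marked in row r of the n x n square.
    """
    row_group = [g for g, size in enumerate(row_sizes) for _ in range(size)]
    col_group = [g for g, size in enumerate(col_sizes) for _ in range(size)]
    counts = [[0] * len(col_sizes) for _ in row_sizes]
    for r, c in enumerate(perm):
        counts[row_group[r]][col_group[c]] += 1
    return counts


def moments(row_sizes, col_sizes):
    """Exact mean and variance of each n_ij over all n! selections."""
    n = sum(row_sizes)
    if n != sum(col_sizes):
        raise ValueError("row and column group sizes must add to the same n")
    perms = list(permutations(range(n)))
    I, J = len(row_sizes), len(col_sizes)
    mean = [[Fraction(0)] * J for _ in range(I)]
    second = [[Fraction(0)] * J for _ in range(I)]
    for perm in perms:
        counts = cell_counts(perm, row_sizes, col_sizes)
        for i in range(I):
            for j in range(J):
                mean[i][j] += Fraction(counts[i][j], len(perms))
                second[i][j] += Fraction(counts[i][j] ** 2, len(perms))
    var = [[second[i][j] - mean[i][j] ** 2 for j in range(J)] for i in range(I)]
    return mean, var
```

Full enumeration is practical only for small $n$ (here $n!$ selections), but
it gives exact fractions to set against the expressions for $E(n_{ij})$ and
$V(n_{ij})$.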






Lemma 3 

Let w ^ SZCijWv where Ci * 8 are constants - Then 

V(w) = 2 J Cij 2 V(riij) + J 2 X ^ ov 

+ 2 X X CiiCvi Cov ( ' 7lii,n ^ + X X ,X X c °v (n ijin ) 


In particular, for ca = 7 0 (n t \»./) S we Aave 

KM = ^rri) IX ((£ - 0 fe - 0 v 

+ (i - „” ) K«(*. - n) + (i-£) K,(C, - r*> 

+ YdY - (Ri - Yii) - (Cj - Yij) - r<,]J 

* — VY Yu(— r«-^ft-"Ci + r) 

“»«(» _i)44 ”Wn.,- n,-. n. y } 

where $R_i$ and $C_j$ are totals of the $i$th row and $j$th column in Table 9.6.
With this background, we prove the following theorem.

Theorem 9.5 

Assuming the sample sizes within substrata 8n ^’ m !!^ t /g jg) isgivenby J 
population corrections negligible, the variance of the estimator (9. W 




_ ia-io + r)’ 

Wi. n -j 


(9.19) 


proof Given w,, 


rxiuvf v 

E*t) - » X 2 K » K 2 (f) = »* X 2 (^>7 S “ ’ 


Hence 


E t V,(f) =«X2 


;\W 

ni.n.j 


n r nijY%i 

V l E 2 (Y) = w 2 7(w) w = 2, 2 n,w,- 




By Lemma 3, we have

rM* - ll F «£ ^ - n " * - Jc, + r) 

-zhm^r/'-ZZv-izv+r) 

= » s (» - i) ^ («ST/ F * ~ .< 81 “ £ c< + y ) 

This proves the theorem. a 

Remark Owing to the dependence of the sample sizes $n_{ij}$, the variance of
$\hat{Y}$ is not simply a function of variances $S_{ij}^2$ within substrata.

If only one-way stratification be employed, the estimator
would be $\hat{Y} = \sum N_{i.}\bar{y}_{i.}$ with a variance of

v-^v + y—y—(y tf - fi) ! 

L m 4 1 V ( . ^ «(. ^ < 

t ^ 

approximately. If the allocation is proportional, n, = «*•./*. Then 

V(f) = - [2ZAW + szjv«(r« - fOI 
v n 

, n c it ie nossible to write down.the condition that two- 

9.7 SYSTEMATIC SAMPLING

In Sec. 3.8 systematic sampling was presented as one of the
basic methods of sample selection. The performance
of this technique depends on the relationship of the variate with the order
in which units are listed in the population. We shall now give a number
of examples of this phenomenon.

9.7.1 DATA EXHIBITING PERIODICITY

Suppose we want to make an estimate of the number of vehicles passing
over a bridge during a month. We expect that traffic over the
bridge exhibits periodicity during a day, there being some hours which are
very busy and others when there is very little traffic. If we select
an hour at random from the first day and examine the traffic over this
hour and subsequent 24-hour periods (i.e., take a systematic sample of
hours over the month), sample-to-sample variation would be very large.
If the hour selected at random happens to be the peak hour, the sample
will contain all peaks and this will produce a very high figure. On the
other hand, if the first hour selected shows poor traffic, all observations
taken at this time during the subsequent days are expected to be well
below the average, thereby producing a very low figure. Thus if there
is periodicity present in the data and the sampling interval $k$ coincides
with the period, it will be unwise to take a systematic sample with this
value of $k$. It is not uncommon to come across populations with periodic
features. Temperatures over a 24-hour period, sales of stores over a week,
and postal articles received in a post office over the week are some other
examples of the occurrence of periodicity. One has to be sufficiently
acquainted with the data on hand in order to be able to decide on the
sampling interval if systematic sampling is to be used. There will be no
such problem involved if a random sample is taken or different random
starts are used within strata.
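The effect of an interval that coincides with the period can be seen on
made-up data. In the sketch below the hourly traffic profile, the 30-day
length, and the function names are all hypothetical; the variance of the
systematic sample mean is computed exactly over the $k$ possible starts and
compared with that of a simple random sample of the same size.

```python
def systematic_means(y, k):
    """Means of the k systematic samples with interval k (N divisible by k)."""
    return [sum(y[start::k]) / len(y[start::k]) for start in range(k)]


def variance_systematic(y, k):
    """Exact variance of the systematic sample mean over the k random starts."""
    means = systematic_means(y, k)
    ybar = sum(y) / len(y)
    return sum((m - ybar) ** 2 for m in means) / k


def variance_srs(y, n):
    """Variance of the mean of a simple random sample of n units (wtr)."""
    N = len(y)
    ybar = sum(y) / N
    s2 = sum((v - ybar) ** 2 for v in y) / (N - 1)
    return (1 - n / N) * s2 / n


# Hypothetical 24-hour traffic profile, repeated identically for 30 days.
profile = [5] * 6 + [40, 90, 120, 80] + [50] * 6 + [100, 130, 90, 40] + [10] * 4
traffic = profile * 30

if __name__ == "__main__":
    print(variance_systematic(traffic, 24), variance_srs(traffic, 30))
    print(variance_systematic(traffic, 5))
```

With interval 24 every sample repeats one hour of the day, so the sample
means differ as much as the hours themselves; with interval 5, which shares
no factor with 24, each sample covers all hours of the day equally often in
this artificial population.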

9.7.2 POPULATIONS SHOWING TREND

If there is some kind of trend present in the population (for example, when
households are listed in order of income), substantial gains can be achieved
by taking a systematic sample, which will give an even spread over the
population. A stratified design with one sample unit per stratum will be
still better. This will be shown with the help of an example in which the
$y$ for a unit equals the position at which it is placed in the universe,
that is, $y_j = j$ $(j = 1, 2, \ldots, N)$. Let $N = nk$, where $k$ is the
sampling interval. The population variance $S_y^2$ will be found to be
$N(N + 1)/12$. If the first $k$ units form one stratum, the next $k$ units
form another, and so on, the variance within a stratum would be
$k(k + 1)/12$. A systematic sample with random start $i$ will give the
observations $i, i + k, \ldots, i + (n - 1)k$, so that the means of the
systematic samples will be $i + (n - 1)k/2$ $(i = 1, \ldots, k)$. Denoting
by $V_1$, $V_2$, and $V_3$ the variance of the estimate of the mean in the
case of simple random sampling, systematic sampling, and stratified random
sampling with one sample unit per stratum, we have

$$V_1 = \frac{(k - 1)(N + 1)}{12} \qquad V_2 = \frac{k^2 - 1}{12} \qquad V_3 = \frac{k^2 - 1}{12n}$$

This shows that in this case stratification with one unit per stratum is
superior to systematic sampling, which in turn is better than simple
random sampling. Madow and Madow (1944) provide further information
on this point.
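The three variances for the linear-trend population $y_j = j$ can be
computed from first principles and set against the closed forms
$(k - 1)(N + 1)/12$, $(k^2 - 1)/12$, and $(k^2 - 1)/12n$. The sketch below is
illustrative only; the function name `trend_variances` and the values of $n$
and $k$ in the check are assumptions.

```python
from fractions import Fraction


def trend_variances(n, k):
    """Variances of the sample mean for y_j = j, j = 1..N, with N = n*k.

    Returns (simple random wtr, systematic, stratified with one unit
    per stratum of k consecutive units).
    """
    N = n * k
    y = list(range(1, N + 1))
    ybar = Fraction(sum(y), N)

    # Simple random sampling without replacement.
    S2 = sum((v - ybar) ** 2 for v in y) / (N - 1)
    v_random = (1 - Fraction(n, N)) * S2 / n

    # Systematic sampling: k equally likely samples, interval k.
    sys_means = [Fraction(sum(y[i::k]), n) for i in range(k)]
    v_systematic = sum((m - ybar) ** 2 for m in sys_means) / k

    # Stratified sampling: one unit drawn at random from each stratum.
    v_stratified = Fraction(0)
    for h in range(n):
        stratum = y[h * k:(h + 1) * k]
        mean_h = Fraction(sum(stratum), k)
        v_stratified += sum((v - mean_h) ** 2 for v in stratum) / k
    v_stratified /= n ** 2

    return v_random, v_systematic, v_stratified
```

For instance, `trend_variances(4, 5)` works on the population 1, 2, ..., 20
split into four strata of five units.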

Further reading If the population is monotone, it is shown in Exercise 15
that centered systematic sampling is more efficient than random start
systematic sampling.


9.7.3 AUTOCORRELATED POPULATIONS 

We shall now consider populations in which there is higher correlation 
between adjacent units than between units further apart, the correlations 
decreasing as the interval between units increases. Cochran (1946) has 
investigated the relative performance of systematic and stratified random 
sampling for such populations. Due to the finiteness of the population, 
the model that $\rho_u \ge \rho_v$ whenever $u < v$ will not hold exactly with a given
population. As a result, the model is assumed to hold over an infinite 
superpopulation, and the finite population at hand is supposed to have 
been selected at random from this superpopulation. In fact we shall 
assume that 

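$$E(y_i) = \mu \qquad E(y_i - \mu)^2 = \sigma^2 \qquad E(y_i - \mu)(y_{i+u} - \mu) = \rho_u\sigma^2$$

where $\rho_u \ge \rho_v \ge 0$ whenever $u < v$. Now, for a specific finite population
$y_1, y_2, \ldots, y_N$ where $N = nk$, the variance of the mean based on a
random sample of size $n$ is given by
$V_1 = (1 - 1/k)[1/n(N - 1)]\sum (y_i - \bar{Y})^2$
and the variance of the stratified sample estimate (taking one unit per
stratum of $k$ units) is $V_3 = \sum\sum (y_{ij} - \bar{Y}_i)^2/kn^2$. Using the
algebraic identity

$$\sum_i (y_i - \bar{Y})^2 = \frac{1}{2N}\sum_i\sum_j (y_i - y_j)^2 = \frac{1}{2N}\sum_i\sum_j [(y_i - \mu) - (y_j - \mu)]^2$$

we have

$$E\sum_i (y_i - \bar{Y})^2 = \frac{\sigma^2}{N}\Big[N(N - 1) - 2\sum_{u=1}^{N-1}(N - u)\rho_u\Big] = (N - 1)\sigma^2\Big[1 - 2N^{-1}(N - 1)^{-1}\sum_{u=1}^{N-1}(N - u)\rho_u\Big]$$

Hence

$$E(V_1) = \Big(1 - \frac{1}{k}\Big)\frac{\sigma^2}{n}\Big[1 - 2N^{-1}(N - 1)^{-1}\sum_{u=1}^{N-1}(N - u)\rho_u\Big] \qquad (9.20)$$

$$E(V_3) = \Big(1 - \frac{1}{k}\Big)\frac{\sigma^2}{n}\Big[1 - 2k^{-1}(k - 1)^{-1}\sum_{u=1}^{k-1}(k - u)\rho_u\Big] \qquad (9.21)$$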



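Denoting by $\bar{y}_i$ the mean based on the $i$th systematic sample of size $n$,
the variance of the systematic sample mean would be

$$V_2 = \frac{\sum (\bar{y}_i - \bar{Y})^2}{k} = \frac{\sum n(\bar{y}_i - \bar{Y})^2}{N} = \frac{\text{total sum of squares} - \text{sum of squares within samples}}{N}$$

Hence

$$N E(V_2) = (N - 1)\sigma^2\Big[1 - 2N^{-1}(N - 1)^{-1}\sum_{u=1}^{N-1}(N - u)\rho_u\Big] - k(n - 1)\sigma^2\Big[1 - 2n^{-1}(n - 1)^{-1}\sum_{u=1}^{n-1}(n - u)\rho_{ku}\Big]$$

or

$$E(V_2) = \Big(1 - \frac{1}{k}\Big)\frac{\sigma^2}{n}\Big[1 - 2N^{-1}(k - 1)^{-1}\sum_{u=1}^{N-1}(N - u)\rho_u + 2kn^{-1}(k - 1)^{-1}\sum_{u=1}^{n-1}(n - u)\rho_{ku}\Big] \qquad (9.22)$$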

We shall now make comparisons among (9.20), (9.21), and (9.22). We
shall first prove the following lemma.

Lemma

If $\rho_i$ are positive and $\rho_i - \rho_{i+1} \ge 0$ $(i = 1, 2, \ldots, m - 1)$,
a necessary and sufficient condition that $L = \sum_{i=1}^{m} a_i\rho_i \ge 0$
is that $A_i = \sum_{j=1}^{i} a_j \ge 0$ for every $i$.


PROOF 

£ = S i“i + Ma 1 + „ l)+ . . 

that L > o ^ P° sitl ve, the conditio } 

make ^egative'by^T ^ * the coefficf’ r 0 is sufficient to establ 
Pr0T * that theco ndl 7“ 8 *< Positive *T n , ° f 5 ‘ * negative, we c 
nhs lemm a ° i s ne^rf the other aero. T 

d ‘ P prov ing that 

*" ■ - - „ h , y „ _ 


DEVELOPMENTS 


213 


OTHER 


is a monotonically increasing function of k. We have 

L(k) — L(k + 1) = —2 [k(k 2 — l)] -1 ^ (k + 1 — 2 u)p u 


u=X 


Since 


V (k 4" 1 — 2u) = 0, the lemma applies. As 
^ (fc + 1 — 2-u) = — i) > 0 


u = l 

.. follows that L(k) < L(k + 1). This establishes that E(V,) < E(V i). 
Thus the average variance based on the stratified sample is less than or 
1 fhaf of the random sample. No such general result can be proved 
e ? U ]lf t i. e efficiency of systematic sampling relative to random sampling, 
X tetherrestrictions are imposed on the correlations We shall 
£ fact prove the following theorem. 


Theorem 9.6 


If $\rho_i \ge \rho_{i+1} \ge 0$, $i = 1, 2, \ldots, N - 1$, and

$$\delta_i^2 = \rho_{i-1} + \rho_{i+1} - 2\rho_i \ge 0 \qquad i = 2, 3, \ldots, N - 2 \qquad (9.23)$$

then

$$E(V_2) \le E(V_3) \le E(V_1)$$
PROOF Just as the variance $V_2$ was developed as [total sum of
squares (s.s.)]/$N$ − (average s.s. within systematic samples)/$n$, the
variance $V_3$ can be expressed as (total s.s.)/$N$ − (average s.s. within
stratified samples)/$n$. Hence, in order to prove the theorem, we shall
prove that

E(average s.s. within systematic samples) ≥ E(average s.s. within stratified samples)

Now given a set of values $a_1, a_2, \ldots, a_n$, it is easily seen that

$$\sum (a_i - \bar{a})^2 = \frac{n - 1}{2}\,E'(a_i - a_j)^2$$

where $E'$ denotes averaging over the different pairs. Denoting by $y_i$ and
$y_l$ the sample units from the $i$th and $l$th strata, the average sum of
squares would be given by $[(n - 1)/2]E'(y_i - y_l)^2$, where $E'$ is over
different pairs of strata. Consider now a fixed pair of strata with
$l - i = u$. In the case of the systematic sample, the elements in the two
strata are always at a distance of $ku$. Hence
$E(y_i - y_l)^2 = 2\sigma^2(1 - \rho_{ku})$. For the stratified sample there
are $k^2$ possible pairs of elements from the two strata. One pair is
$(ku - k + 1)$ elements apart, two pairs are $(ku - k + 2)$ elements apart,
and so on. Hence for the stratified sample we have

$$E(y_i - y_l)^2 = 2\sigma^2\Big[1 - \frac{1}{k^2}\sum_{j=-(k-1)}^{k-1}(k - |j|)\rho_{ku+j}\Big]$$

Thus, to prove the theorem, it is sufficient to show that

$$\frac{1}{k^2}\sum_{j=-(k-1)}^{k-1}(k - |j|)\rho_{ku+j} \ge \rho_{ku}$$

or

$$\sum_{j=-(k-1)}^{k-1}(k - |j|)\rho_{ku+j} - k^2\rho_{ku} \ge 0$$

or

$$\sum_{j=1}^{k-1}(k - j)(\rho_{ku+j} + \rho_{ku-j} - 2\rho_{ku}) \ge 0$$

Now

$$\rho_{ku+j} + \rho_{ku-j} - 2\rho_{ku} = \sum_{t=-(j-1)}^{j-1}(j - |t|)\,\delta_{ku+t}^2 \ge 0$$

so that $E(V_2) \le E(V_3)$. As it has already been shown that
$E(V_3) \le E(V_1)$, the theorem is proved. ■

It has been proved that systematic sampling is superior to
stratified random sampling if (9.23) holds, that is, if the
correlogram is concave upwards. This condition does not appear to be an
impossible one as far as practical applications are concerned.
Several authors, including Wold (1938), Osborne (1942), and Fisher and
Mackenzie (1922), have proposed such models for specific
natural populations.
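The ordering of the three expected variances can be illustrated with a
correlogram that meets both conditions of Theorem 9.6. The sketch below
assumes $\rho_u = \rho^u$ with $0 < \rho < 1$ (for which
$\rho_{u-1} + \rho_{u+1} - 2\rho_u = \rho^{u-1}(1 - \rho)^2 \ge 0$) and uses
(9.20) to (9.22) in the form in which they are read here; the function name
and the parameter values are hypothetical.

```python
def expected_variances(rho, n, k, sigma2=1.0):
    """Expected variances under rho_u = rho**u (an assumed correlogram).

    Returns (E(V1) random, E(V2) systematic, E(V3) stratified),
    following the reading of (9.20), (9.22), and (9.21) respectively.
    Requires k >= 2.
    """
    N = n * k

    def r(u):
        return rho ** u

    common = (1 - 1 / k) * sigma2 / n
    s_N = sum((N - u) * r(u) for u in range(1, N))      # lags within the population
    s_k = sum((k - u) * r(u) for u in range(1, k))      # lags within a stratum
    s_n = sum((n - u) * r(k * u) for u in range(1, n))  # lags within a systematic sample

    ev_random = common * (1 - 2 * s_N / (N * (N - 1)))
    ev_stratified = common * (1 - 2 * s_k / (k * (k - 1)))
    ev_systematic = common * (
        1 - 2 * s_N / (N * (k - 1)) + 2 * k * s_n / (n * (k - 1))
    )
    return ev_random, ev_systematic, ev_stratified
```

A call such as `expected_variances(0.8, 10, 5)` returns the three quantities
for a population of 50 units sampled at interval 5.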


9.8 CONTROLLED SELECTION

Controlled selection may be described conceptually as the setting up of
many purposive samples such that every unit in the universe is included in
one or more samples. The number of samples in which each unit occurs
must be exactly proportionate to its assigned probability of selection.
After the complete set of purposive samples has been established, a

random selection of one of them constitutes a probability sample. In 
the subjective approach many samples are examined and discarded arbi¬ 
trarily; with this technique the preferred and other samples are recorded 
and assigned probabilities of selection, the final choice being a probability 
selection from the totality recorded. An example of the use of this 
method has already been given in Sec. 4.8.3. The following simple example
will be presented to make the ideas clear. The universe consists of
nine schools; four $(L_1, L_2, L_3, L_4)$ of them are large and the other five
$(S_1, S_2, S_3, S_4, S_5)$ are small. Two of the large schools, namely $L_1$
and $L_2$, are state-controlled, and the other two are under private
management. In the case of small schools, $S_1$ and $S_2$ are state-owned, and
the others are run privately. The problem is to select a sample of two
schools, one large and one small. Each large school should have a 1/4 chance
of being included in the sample and each small school should have a 1/5
chance of being included. Preferably, the sample should contain schools of
either type (state-owned and privately managed). We may then list the
following eight samples and select one of them with the probabilities
specified below.


Sample                   L1S3   L2S4   L3S1   L4S2   L1S5   L2S5   L3S5   L4S5
Probability of sample    0.20   0.20   0.20   0.20   0.05   0.05   0.05   0.05
Cumulative probability   0.20   0.40   0.60   0.80   0.85   0.90   0.95   1.00


It can be verified that the large and small schools have a chance of 

0.25 and 0.20, respectively, of being selected in the sample. And the 

probability of selecting a preferred sample (containing schools of either 

type) is 0.90. In case a school is selected at random from the four large 

ones and another one is selected independently from the five small ones, 

the chance of a preferred sample would be only 0.50. Thus the method 

of controlled selection has enabled us to increase the probability of a 

preferred combination from 0.50 to 0.90. This problem can also be solved 

by the methods of linear programming (see Sec. 9.5 and Exercises 92 
and 93). 
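The figures quoted for controlled selection can be checked by direct enumeration. The following is a minimal sketch in Python; the particular pairing of schools in `CONTROLLED` is an assumption chosen to be consistent with the stated probabilities (any pairing with the same structure gives the same totals), and the function names are illustrative only.

```python
from fractions import Fraction

# State-run schools; every other school is privately managed.
STATE = {"L1", "L2", "S1", "S2"}
LARGE = ["L1", "L2", "L3", "L4"]
SMALL = ["S1", "S2", "S3", "S4", "S5"]

# Assumed set of eight samples: four preferred samples with probability 0.20
# and four samples containing S5 with probability 0.05 each.
CONTROLLED = {
    ("L1", "S3"): Fraction(1, 5), ("L2", "S4"): Fraction(1, 5),
    ("L3", "S1"): Fraction(1, 5), ("L4", "S2"): Fraction(1, 5),
    ("L1", "S5"): Fraction(1, 20), ("L2", "S5"): Fraction(1, 20),
    ("L3", "S5"): Fraction(1, 20), ("L4", "S5"): Fraction(1, 20),
}


def inclusion_probabilities(samples):
    """Chance that each school appears in the selected sample."""
    probs = {}
    for (large, small), p in samples.items():
        probs[large] = probs.get(large, 0) + p
        probs[small] = probs.get(small, 0) + p
    return probs


def is_preferred(sample):
    """A preferred sample has one state school and one private school."""
    large, small = sample
    return (large in STATE) != (small in STATE)


def preferred_probability(samples):
    return sum(p for s, p in samples.items() if is_preferred(s))


def independent_selection():
    """One large and one small school drawn independently at random."""
    p = Fraction(1, len(LARGE) * len(SMALL))
    return {(l, s): p for l in LARGE for s in SMALL}


if __name__ == "__main__":
    print(inclusion_probabilities(CONTROLLED))
    print(preferred_probability(CONTROLLED))
    print(preferred_probability(independent_selection()))
```

Exact fractions are used so that the inclusion probabilities can be compared with 1/4 and 1/5 without rounding.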


9.9 A GENERAL RULE FOR VARIANCE ESTIMATION IN MULTISTAGE SAMPLING

Suppose n psu's are selected from the N without replacement with
unequal probabilities. To begin with, we shall assume that the sample
psu's are completely enumerated (single-stage sampling). In order to
estimate the stratum total, the very general estimator $\sum a_{is} y_i$ will be
used, where $a_{is}$ ($i = 1, \ldots, N$) are real numbers predetermined for
every sample s, with the restriction that $a_{is} = 0$ whenever the sample s
does not contain the ith psu. In order that the estimator be unbiased
whatever the y's, the condition is that $E(a_{is}) = 1$ for every i. The
variance of the estimator would be

$$V\Big(\sum_i a_{is} y_i\Big) = \sum_i y_i^2 V(a_{is}) + 2 \sum_i \sum_{j>i} y_i y_j \,\mathrm{Cov}(a_{is}, a_{js}) \tag{9.24}$$

Let

$$f(y) = \sum_i b_{is} y_i^2 + \sum_i \sum_{j>i} d_{ijs} y_i y_j \tag{9.25}$$

be an unbiased estimator of $V(\sum a_{is} y_i)$, where, like $a_{is}$, the real numbers
$b_{is}$, $d_{ijs}$ are predetermined for every sample s. It follows that

$$E(b_{is}) = V(a_{is}) \tag{9.26}$$

Now consider the multistage case in which the psu's are subsampled
independently in a known manner. Given the ith psu, let $t_i$ (based on
sampling at the second and subsequent stages) be an unbiased estimator
of $y_i$. Further, let $V(t_i \mid i) = \sigma_i^2$ and $E(w_i \mid i) = \sigma_i^2$.

As an unbiased estimator of the stratum total we shall use

$$\hat{y} = \sum_{i=1}^{N} a_{is} t_i \tag{9.27}$$

whose variance is

$$V(\hat{y}) = V\Big(\sum_i a_{is} y_i\Big) + \sum_{i=1}^{N} E(a_{is}^2)\,\sigma_i^2 \tag{9.28}$$

Then we shall prove the result:

$$E\Big[f(t) + \sum_i a_{is} w_i\Big] = V(\hat{y}) \tag{9.29}$$

where $f(t) = \sum_i b_{is} t_i^2 + \sum_i \sum_{j>i} d_{ijs} t_i t_j$ is obtained from $f(y)$ by substituting
$t_i$ for $y_i$.

PROOF

$$E f(t) = E f(y) + E \sum_i b_{is} \sigma_i^2 = E f(y) + \sum_i V(a_{is})\,\sigma_i^2 = V(\hat{y}) - \sum_i \sigma_i^2$$

But $\sum a_{is} w_i$ is an unbiased estimator of $\sum \sigma_i^2$, from which the result
follows (Raj, 1966b). The rule for estimating the variance in multistage
sampling can then be stated as follows. Obtain an unbiased estimator of
the variance in single-stage sampling. Obtain a copy of it by substituting
for $y_i$ its estimate $t_i$. Also obtain the copy of the estimator of the stratum
total in single-stage sampling by substituting $w_i$ for $y_i$. The sum of the
two copies is an unbiased estimator of the variance in the multistage case.




It may be noted that this rule is slightly different from that given by
Durbin (Exercise 56). This rule will be more handy when the estimators
are based on conditional probabilities, in which case it may be difficult to
calculate $\pi_i$, the probability with which the ith psu is selected in the entire
sample. This rule was noticed by the author (1954, 1956) in connection
with two different sampling schemes. As an example, consider the
two-stage design of Theorem 6.2. The population total estimator in
single-stage sampling is

$$\frac{N}{n}\, S\, y_i$$

and the variance estimator is

$$\frac{N^2}{n}\Big(1 - \frac{n}{N}\Big)\frac{1}{n-1}\, S\Big(y_i - \frac{1}{n} S y_i\Big)^2$$

The copies of the two are

$$\frac{N}{n}\, S\, \frac{M_i^2}{m_i}\Big(1 - \frac{m_i}{M_i}\Big) s_i^2$$

and

$$\frac{N^2}{n}\Big(1 - \frac{n}{N}\Big)\frac{1}{n-1}\, S\Big(M_i \bar{y}_i - \frac{1}{n} S M_i \bar{y}_i\Big)^2$$

Here $t_i = M_i \bar{y}_i$ is based on $m_i$ of the $M_i$ second-stage units in the ith psu,
and $s_i^2$ is the sample variance within that psu.
Hence the sum of the two copies is an unbiased estimator of the variance
in the two-stage design.
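The rule can be checked exactly on a small artificial population by listing every possible two-stage sample. The sketch below is an illustration only: it assumes simple random sampling without replacement at both stages with the same second-stage sample size m in every psu, and the function name and the data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def enumerate_two_stage(psus, n, m):
    """Return (E[estimate], V[estimate], E[variance estimate]) by full enumeration.

    psus: list of lists of y-values, one inner list per psu.
    n: number of psu's drawn (srs without replacement), n >= 2.
    m: number of second-stage units drawn per selected psu, m >= 2.
    """
    N = len(psus)
    p_first = Fraction(1, comb(N, n))
    outcomes = []  # (probability, estimate, variance estimate)
    for chosen in combinations(range(N), n):
        subsamples = [list(combinations(psus[i], m)) for i in chosen]
        p_second = Fraction(1)
        for options in subsamples:
            p_second /= len(options)
        for drawn in product(*subsamples):
            t, w = [], []
            for i, ys in zip(chosen, drawn):
                M = len(psus[i])
                ybar = Fraction(sum(ys), m)
                s2 = sum((y - ybar) ** 2 for y in ys) / (m - 1)
                t.append(M * ybar)                                        # t_i
                w.append(Fraction(M * M, m) * (1 - Fraction(m, M)) * s2)  # w_i
            estimate = Fraction(N, n) * sum(t)
            t_mean = sum(t) / n
            # Copy of the single-stage variance estimator (t_i in place of y_i).
            copy_of_variance = (Fraction(N * N, n) * (1 - Fraction(n, N))
                                * sum((ti - t_mean) ** 2 for ti in t) / (n - 1))
            # Copy of the single-stage total estimator (w_i in place of y_i).
            copy_of_total = Fraction(N, n) * sum(w)
            outcomes.append((p_first * p_second, estimate,
                             copy_of_variance + copy_of_total))
    mean = sum(p * e for p, e, _ in outcomes)
    variance = sum(p * (e - mean) ** 2 for p, e, _ in outcomes)
    expected_v = sum(p * v for p, _, v in outcomes)
    return mean, variance, expected_v


if __name__ == "__main__":
    population = [[1, 2, 4], [3, 5, 6, 9], [2, 2, 7]]
    print(enumerate_two_stage(population, n=2, m=2))
```

The second and third values returned should coincide, which is what the rule asserts for this design.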

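To take another example, consider the sample design (Sec. 6.10.3)
in which the stratum is divided up into n substrata by allocating the
psu's at random to them and in which one psu is selected with pps from
each substratum. It is known that in single-stage sampling the stratum
total estimator is (see Exercises 16 and 54)

$$\hat{y} = \sum_{i=1}^{n} \frac{\pi_i y_i}{p_i} \tag{9.30}$$

and the variance estimator is

$$\frac{\sum N_i^2 - N}{N^2 - \sum N_i^2}\Big[\sum_{i=1}^{n} \pi_i \frac{y_i^2}{p_i^2} - \hat{y}^2\Big] \tag{9.31}$$

where $N_i$ is the number of psu's in the ith substratum, $p_i$ is the selection
probability of the psu drawn from it, and $\pi_i$ is the sum of the selection
probabilities of all psu's in that substratum.
Hence an unbiased variance estimator in the multistage case is given by
our rule as:

$$v(\hat{y}) = \frac{\sum N_i^2 - N}{N^2 - \sum N_i^2}\Big[\sum_{i=1}^{n} \pi_i \frac{t_i^2}{p_i^2} - \Big(\sum_{i=1}^{n} \frac{\pi_i t_i}{p_i}\Big)^2\Big] + \sum_{i=1}^{n} \frac{\pi_i w_i}{p_i} \tag{9.32}$$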







211 sampling theory 

9.10 SAMPLING FROM IMPERFECT FRAMES 

We have by and large assumed that given a target population with 
reporting units 

$U_1, U_2, \ldots, U_N$ (9.33)

there exists a perfect frame (or list) of these units from which to select 
the sample. In practice such frames rarely exist and one has to fall back 
on imperfect frames. The frame may be imperfect in that it does not 
contain all the units of the target population, or some units may occur 
more than once, or some particulars (e.g., measures of size) of the listed 
units are inaccurate, or that certain units in the frame do not belong to 
the target population. Thus the frame has on it the units 

$f_1, f_2, \ldots$ (9.34)

and the sample is to be selected from this and not from (9.33). The
problem is to establish rules of association between the listed units $f_k$ and
the reporting units $U_j$ such that the selection of units $f_k$ with known
probabilities leads to the selection of units $U_j$ also with known
probabilities. The selection mechanism should give rise to known nonzero
probabilities $\pi_j$ and $\pi_{jj'}$ for the units in the target population. This is not
always easy. Suppose the target population (9.33) consists of all
individuals currently resident in a city, whereas the listed units in (9.34) are
addresses taken at the time of the last census. It may be found that 
some houses have been demolished since the last census, others have 
come up since then, some single units at listed addresses have been con¬ 
verted into multiple units, and so on. All this will have to be taken into 
account when (9.34) is used for sampling from (9.33). A general dis¬ 
cussion of some of the problems involved is given here. 

Extraneous units If the frame contains some units which are not
in the target population, their selection in the sample will give rise to a
value of zero for the characteristic y under study. No bias is involved
thereby, although the variance will increase somewhat. As far as
practicable, such units should be removed from the frame before sample selection.

Duplications If some units occur more than once in the frame, this
will affect their probability of selection. No real problem is involved
when these probabilities can be ascertained. To the extent possible, the
frame should be unduplicated before selecting the sample. When the
extent of duplications is not known, it is possible to estimate it on the
basis of a sample taken from the frame (Exercise 101). If there are two
frames available, sampling methods will help in estimating the number of
units common to the two (see Exercises 87 and 88).

Inaccurate sizes If the sizes of some of the units in the frame are
inaccurate and this information has been used for sample selection, it
will happen that some unexpectedly large or small units appear in the
stratum. This will increase the variance of the estimates made. When
the survey is repeated over a period of time, the surprise stratum technique
(Exercise 67) may be used in order to reduce the impact of the unusual
units on the variance.

Incomplete lists If the list is incomplete (i.e., if some units in the 
target population do not find a place on it), it should be supplemented by 
other lists or area samples in order to take in units not on the list. (In 
an area sample the sampling units are land areas and the reporting units 
in the sample are identified through geographic rules in the field.) The 
area sample may contain some units which are on the list. One procedure 
is to remove such units from the area sample, since these have a chance of 
selection through the list. A better method is not to exclude them from 
the area sample but to use the procedures indicated in Exercise 89 and 
Exercise 96(a). Sometimes it is possible to locate the successor of each 
unit in the target population. In that case a sample taken from the list 
will lead to a sample from the target population if the method indicated 
in Exercise 96(b) is used.
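The remarks on extraneous units and duplications can be illustrated with a small exact calculation. The sketch below is only one way of making the point: it assumes a single equal-probability draw from a frame of F entries and the estimator F y/d, where d is the number of times the selected unit is listed. The frame, the y-values, and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def expected_total_estimate(frame, y, adjust_for_duplicates=True):
    """Exact expectation of the estimate of the total from one random frame entry.

    frame: list of unit labels as they appear on the list (repeats allowed).
    y: values of the characteristic for units in the target population;
       labels missing from y are extraneous and contribute zero.
    """
    counts = Counter(frame)
    F = len(frame)
    expectation = Fraction(0)
    for listed in frame:                      # each entry has chance 1/F
        value = Fraction(y.get(listed, 0))    # extraneous entries give y = 0
        if adjust_for_duplicates:
            value /= counts[listed]           # divide by the multiplicity d
        expectation += Fraction(1, F) * F * value
    return expectation


if __name__ == "__main__":
    frame = ["a", "b", "b", "c", "x"]         # "b" listed twice, "x" extraneous
    y = {"a": 10, "b": 4, "c": 7}             # true total is 21
    print(expected_total_estimate(frame, y))
    print(expected_total_estimate(frame, y, adjust_for_duplicates=False))
```

With the multiplicities known, the adjusted estimator has the true total as its expectation, and the extraneous entry causes no bias; ignoring the duplication overstates the total.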


9.11 SAMPLING INSPECTION 

A large-scale survey is an exercise in statistical engineering. Each step 
in the production line is a potential source of error. The sample units 
may not be identified correctly, enumerators may make errors in the 
field, there may be errors of coding or punching the cards in the office, 
and so on. Thus it becomes important to ensure that the production
process is under control and that the outgoing quality is acceptable.
Sampling methods can play an important part in achieving this. The 
problem of control of clerical errors by inspection on a sampling basis 
will be discussed here. 

Suppose punched cards are being received in lots of N. We agree
to call a card defective if it contains one or more errors of punching. Let
the sampling plan consist in selecting a sample of n cards from the lot for
verification. If the sample contains c or fewer defective cards, the lot is
accepted; otherwise it is verified on a 100 percent basis. In either case
the cards found to be defective are corrected. If the proportion of
defective cards is P, the probability of accepting the lot will be

$$L(P) = \sum_{d=0}^{c} \binom{NP}{d}\binom{N - NP}{n - d} \Big/ \binom{N}{n} \tag{9.35}$$

Assuming that the lot size N is very large as compared with n,

$$L(P) = \sum_{d=0}^{c} \binom{n}{d} P^d (1 - P)^{n-d} \tag{9.36}$$



The graph of L(P) against P, 0 < P < 1 is called the operating 
characteristic (OC) curve of the sampling plan (n,c), and c is called the 
acceptance number. The OC curve gives the probabilities with which 
lots of different quality are accepted. 
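Points on the OC curve are easily computed. The following sketch gives the acceptance probability both for a finite lot (hypergeometric form) and for a lot that is very large compared with the sample (binomial form); the function names and the plan (n = 50, c = 2) used in the illustration are assumptions, not taken from the text.

```python
from math import comb


def accept_prob_binomial(n, c, P):
    """L(P) when the lot is very large compared with the sample."""
    return sum(comb(n, d) * P**d * (1 - P)**(n - d) for d in range(c + 1))


def accept_prob_hypergeometric(N, n, c, defectives):
    """L(P) for a lot of N cards containing the given number of defectives."""
    favourable = sum(comb(defectives, d) * comb(N - defectives, n - d)
                     for d in range(c + 1))
    return favourable / comb(N, n)


def oc_curve(n, c, steps=20):
    """Pairs (P, L(P)) on an equally spaced grid of P between 0 and 1."""
    return [(k / steps, accept_prob_binomial(n, c, k / steps))
            for k in range(steps + 1)]


if __name__ == "__main__":
    for P, L in oc_curve(n=50, c=2):
        print(f"P = {P:.2f}   L(P) = {L:.4f}")
```

For a lot of 10,000 cards the two forms agree closely, which is why the binomial form is normally used in practice.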

Let the lot be considered satisfactory if the proportion defective is
$P_1$ or lower and unsatisfactory if the proportion defective is $P_2$ or higher.
The proportion $P_1$ is called the acceptable quality level (AQL) and $P_2$
the lot tolerance percent defective (LTPD). Obviously, $L(P_2)$ is the
chance of accepting a lot at the LTPD level and $1 - L(P_1)$ is the chance
of rejecting a lot at the AQL.

Since rejected lots are to be inspected on a 100 percent basis, the
expected amount of inspection will be

$$C(P) = nL(P) + N[1 - L(P)] \tag{9.37}$$




the ImpB„ U gX C See ZbeTl* ^ °‘ 

Since the cards found to be defective are corrected,
the proportion defective in the outgoing lots (after sample inspection)
will be smaller than P. Its expected value will be given by

(9.38)

which can be approximated as

$$P_A = P\Big[1 - \frac{1}{N} C(P)\Big] \tag{9.39}$$

This is called the average outgoing quality (AOQ) of the sampling plan.
Its maximum over P is known as the average outgoing quality limit (AOQL).
The parameters of the plan will be so determined that

$$L(P_1) = 1 - \alpha$$
$$L(P_2) = \beta$$

Further reading See Exercise 103, in which the two parameters of the 
sampling plan are obtained. 

Alternatively, lots at LTPD may be accepted with a frequency of $\beta$,
and the cost of inspection be made a minimum for lots of expected quality
P. In this case

$$C(P) = nL(P) + N[1 - L(P)]$$

is to be minimized, subject to the constraint that $L(P_2) = \beta$. Finally,
the AOQL of the plan may be specified as $P_0$, whatever the proportion of
defectives in the incoming lots be. The sampling plan chosen is one
which minimizes the cost of inspection for product of expected quality
(Exercise 104).
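The determination of a plan from the two risks, and the quantities C(P), AOQ, and AOQL, can be sketched numerically. Since n and c are integers, the equalities $L(P_1) = 1 - \alpha$ and $L(P_2) = \beta$ cannot in general be met exactly; the sketch below therefore assumes the inequalities $L(P_1) \ge 1 - \alpha$ and $L(P_2) \le \beta$ and takes the smallest n that satisfies them. This search rule, the function names, and the numerical values are illustrative assumptions, not the method of Exercise 103.

```python
from math import comb


def accept_prob(n, c, P):
    """Binomial form of L(P) for the single-sampling plan (n, c)."""
    return sum(comb(n, d) * P**d * (1 - P)**(n - d) for d in range(c + 1))


def find_plan(P1, P2, alpha, beta, max_n=2000):
    """Smallest n, with its smallest c, such that
    L(P1) >= 1 - alpha and L(P2) <= beta; None if no plan is found."""
    for n in range(1, max_n + 1):
        for c in range(n):
            if accept_prob(n, c, P2) > beta:
                break  # a larger c would only raise L(P2) further
            if accept_prob(n, c, P1) >= 1 - alpha:
                return n, c
    return None


def expected_inspection(N, n, c, P):
    """C(P) = n L(P) + N [1 - L(P)]."""
    L = accept_prob(n, c, P)
    return n * L + N * (1 - L)


def aoq(N, n, c, P):
    """Average outgoing quality P [1 - C(P)/N]."""
    return P * (1 - expected_inspection(N, n, c, P) / N)


def aoql(N, n, c, steps=1000):
    """Largest AOQ over a grid of incoming qualities P."""
    return max(aoq(N, n, c, k / steps) for k in range(steps + 1))


if __name__ == "__main__":
    plan = find_plan(P1=0.01, P2=0.05, alpha=0.05, beta=0.10)
    print("plan (n, c):", plan)
    if plan is not None:
        n, c = plan
        print("C(0.02) for N = 5000:", expected_inspection(5000, n, c, 0.02))
        print("AOQL for N = 5000:", aoql(5000, n, c))
```

The grid search for the AOQL is crude but adequate for illustration; a finer grid or a one-dimensional maximizer could be substituted.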


9.11.1 DOUBLE-SAMPLING PLANS 

The plan considered in Sec. 9.11 is an example of a single-sampling plan
characterized by the sample size n and the acceptance number c. However,
we may accept exceptionally good lots and reject exceptionally bad
ones outright and give a second chance to lots of intermediate quality.
Such plans are called double-sampling plans. A sample of $n_1$ cards is
selected and the lot is accepted or rejected depending on whether the
number of defectives does not exceed $c_1$ or is greater than $c_2$. If the
number of defectives lies between $c_1 + 1$ and $c_2$, a second sample of $n_2$
cards is taken. The lot is accepted if the number of defectives does not
exceed $c_2$ in the combined sample of $n_1 + n_2$ cards; otherwise it is
rejected. The operating characteristic (OC) of the plan will be given by

$$L(P) = \sum_{d_1=0}^{c_1} (n_1,d_1) + \sum_{d_1=c_1+1}^{c_2} (n_1,d_1) \sum_{d_2=0}^{c_2-d_1} (n_2,d_2)$$

where $(n,d)$ stands for $\binom{n}{d} P^d (1 - P)^{n-d}$.

The expected cost of inspection will be

$$C(P) = n_1 \sum_{d_1=0}^{c_1} (n_1,d_1) + N \sum_{d_1=c_2+1}^{n_1} (n_1,d_1)$$
$$\qquad + (n_1,c_1+1)\Big[(n_1+n_2) \sum_{d_2=0}^{c_2-c_1-1} (n_2,d_2) + N \sum_{d_2=c_2-c_1}^{n_2} (n_2,d_2)\Big]$$
$$\qquad + \cdots + (n_1,c_2)\Big[(n_1+n_2)(n_2,0) + N \sum_{d_2=1}^{n_2} (n_2,d_2)\Big]$$
$$= n_1 \sum_{d_1=0}^{c_1} (n_1,d_1) + (n_1+n_2) \sum_{d_1=c_1+1}^{c_2} \sum_{d_2=0}^{c_2-d_1} (n_1,d_1)(n_2,d_2) + N[1 - L(P)]$$

The AOQ of the plan is

$$P_A = P - \frac{P\,C(P)}{N} = \frac{P}{N}\Big\{(N - n_1)L(P) - n_2\Big[L(P) - \sum_{d_1=0}^{c_1} (n_1,d_1)\Big]\Big\}$$
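The quantities for a double-sampling plan can be computed directly from their definitions. The sketch below assumes the binomial form of the probabilities and assumes that a lot rejected at either stage is verified completely, at a cost of N cards; the function names and the numerical plan are hypothetical.

```python
from math import comb


def term(n, d, P):
    """Binomial probability of d defectives in a sample of n cards."""
    return comb(n, d) * P**d * (1 - P)**(n - d)


def double_sampling(N, n1, c1, n2, c2, P):
    """Acceptance probability, expected inspection, and AOQ of the plan."""
    # Accepted outright on the first sample.
    accept_first = sum(term(n1, d1, P) for d1 in range(c1 + 1))
    # Second sample taken and lot accepted on the combined sample.
    accept_second = sum(
        term(n1, d1, P) * sum(term(n2, d2, P) for d2 in range(c2 - d1 + 1))
        for d1 in range(c1 + 1, c2 + 1)
    )
    L = accept_first + accept_second
    cost = n1 * accept_first + (n1 + n2) * accept_second + N * (1 - L)
    return {
        "accept_first": accept_first,
        "L": L,
        "cost": cost,
        "aoq": P * (1 - cost / N),
    }


if __name__ == "__main__":
    result = double_sampling(N=5000, n1=50, c1=1, n2=100, c2=4, P=0.03)
    for name, value in result.items():
        print(f"{name}: {value:.5f}")
```

When $c_1 = c_2$ no second sample is ever taken and the plan reduces to the single-sampling plan $(n_1, c_1)$, which gives a convenient check on the computation.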



REFERENCES 

Bryant, E. c. H O 

^ “ d « 


Design and es 

8&wplmg fr °w finite poi 



Hansen, M. H., W. N. Hurwitz, and W. G. Madow (1953). "Sample Survey
Methods and Theory." John Wiley & Sons, Inc., New York.

Hartley, H. O. and J. N. K. Rao (1962). Sampling with unequal probabilities
and without replacement. Ann. Math. Stat., 33.

Horvitz, D. G. and D. J. Thompson (1952). A generalization of sampling
without replacement from a finite universe. J. Am. Stat. Assoc., 47.

Keyfitz, N. (1951). Sampling with probabilities proportional to size: adjustment
for changes in the probabilities. J. Am. Stat. Assoc., 46.

Koop, J. C. (1963). On the axioms of sample formation and their bearing on
the construction of linear estimators in sampling theory for finite universes.
Metrika, 7.

Koopmans, T. C. (1951). "Activity Analysis of Production and Allocation."
John Wiley & Sons, Inc., New York.

Lahiri, D. B. (1954). Technical paper on some aspects of the development of
the sample design. "The National Sample Survey," No. 5. Government of
India, New Delhi.

Madow, W. G. and L. H. Madow (1944). On the theory of systematic sampling.
Ann. Math. Stat., 15.

Osborne, J. G. (1942). Sampling errors of systematic and random surveys of
cover-type areas. J. Am. Stat. Assoc., 37.

Raj, D. (1956). On the method of overlapping maps in sample surveys.
Sankhya, 17.

- (1958). On the estimate of variance in sampling with probabilities
proportionate to size. J. Soc. Sci., 1.

- (1964). Some apparently unconnected problems encountered in
sampling work. Contributions to Statistics. Calcutta.

- (1965). Variance estimation in randomized systematic sampling with
probability proportionate to size. J. Am. Stat. Assoc., 60.

- (1966a). On a method of sampling with unequal probabilities. Ganita,
17.

- (1966b). Some remarks on a simple procedure of sampling without
replacement. J. Am. Stat. Assoc., 61.

Roy, J. and I. M. Chakravarty (1960). Estimating the mean of a finite
population. Ann. Math. Stat., 31.

Satterthwaite, F. E. (1946). An approximate distribution of estimates of
variance components. Biometrics, 2.

Wold, H. (1938). "A Study of the Analysis of Stationary Time Series."
Uppsala.





EXERCISES 


METHODS OF SAMPLE SELECTION 


1. In order to estimate the mean of a finite population, sampling with
replacement with equal probabilities is continued till the sample contains
n distinct units. Let $\nu$ be the total number of selections made, $k_r$ ($S k_r = \nu$)
being the frequency of appearance of the rth distinct unit in the sample.
Defining $\bar{y}_\nu = S k_r y_r/\nu$ and $\bar{y}_n = S y_r/n$, prove that

a. $\bar{y}_\nu$ and $\bar{y}_n$ are unbiased

b. $V(\bar{y}_\nu) = E(1/\nu)\,\sigma_y^2$

c. $E(\nu) = N\left[\dfrac{1}{N} + \dfrac{1}{N-1} + \cdots + \dfrac{1}{N-n+1}\right]$

d. $E(1/\nu) > 1/E(\nu) > (N - n)[n(N - 1)]^{-1}$

Hence or otherwise prove that $V(\bar{y}_\nu) \ge V(\bar{y}_n)$.
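Parts (c) and (d) can be explored numerically before a proof is attempted. The following sketch evaluates the expression in (c) exactly and compares it with a seeded simulation of the sampling procedure; the function names, the values N = 10 and n = 3, and the number of trials are arbitrary illustrative choices.

```python
import random
from fractions import Fraction


def expected_draws(N, n):
    """The expression in part (c): N [1/N + 1/(N-1) + ... + 1/(N-n+1)]."""
    return N * sum(Fraction(1, N - j) for j in range(n))


def lower_bound(N, n):
    """The right-hand member of part (d): (N - n) / [n (N - 1)]."""
    return Fraction(N - n, n * (N - 1))


def simulate_draws(N, n, trials, seed=0):
    """Numbers of with-replacement draws needed to obtain n distinct units."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        seen, draws = set(), 0
        while len(seen) < n:
            seen.add(rng.randrange(N))
            draws += 1
        counts.append(draws)
    return counts


if __name__ == "__main__":
    N, n = 10, 3
    counts = simulate_draws(N, n, trials=20000, seed=1)
    print("exact E(v):     ", float(expected_draws(N, n)))
    print("simulated E(v): ", sum(counts) / len(counts))
    print("simulated E(1/v):", sum(1 / v for v in counts) / len(counts))
    print("1/E(v):         ", float(1 / expected_draws(N, n)))
    print("lower bound:    ", float(lower_bound(N, n)))
```

The simulation is no substitute for the proof, but it shows the three quantities in (d) falling in the stated order.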


2. A population contains N units, the variate-value of one unit being
known to be $y_1$. A wtr random sample of n units is selected from the
remaining (N - 1) units. Show that the estimator $y_1 + (N - 1)\bar{y}_n$ has
a smaller variance than $N\bar{y}_n$ based on a wtr random sample of size n
taken from the entire population.

3. For the general sampling scheme of Sec. 9.4 consider the estimator 

N 

T = J for estimating the population mean ZF f /AT. Suppose T 

t —i 

• ___j _ _ _i_ x _ it _ _r__ J7/m\ _ _i_i • 


- ^ * \ / & • - O * * 

t —1 

is unbiased and belongs to the class for w 
constant. Then prove that 

o. Bo,(s) = ~ 

». JhftfcM] - ± + fci) 

and «[*«<%«] = i-=-* 

N 2 

_ TTw v a 


' X A-— w * 

which F(r) = for 2 , where A is a 


i =j 


i 9 * j 


c. F2oi(s) =0 so that 2a,($) = 1 

By computing M - ElfaU) k 

lower bound for F(T), namely F(T) > ^ ^7 ^ there exists 

*=(# 1] 

¥ w kll I- v«W 

■ s ^ a H denote by Fi tb^ ™ • 

“'ivrc r—W 

«(P.) = JVVKJ^ L a *. E(e) = 0 , f 8U ™ ed bein 

Hence prove that > VN n £ (Fj) _ <(e ) -. ,e Show ^ 

that cubes and hielL < # <r 2 < R2„. t,*r * -+ W ~~ l)<r 2 l/r 

$ at *(V 0 < if P T" S 0f <* ' AssuX 

f = / /2 <VX). ( ’ ? C - ! /(l + cj) where C'" e l' BCW ^ prov 

ni„, ssume that the finite _ . * am 


c i ' v - ' • ; where r 2 — w l * 

; As sume that the fi = tr 2 /X 2 

e stimator &,/ PS e f lmat or ( 1 1 and K the v«' + 

V/ "’ ah °w that /lBjV) )%./p.) and th 7 nances <* the « 
*<T,) - Uen . WI eqUal P'-^ai 



227 


eX ERClSES 

tfence prove that the pps estimator is superior to the equal probability 
estimator if p(x,x ff_l ) > -(AT — l)B 2 S x /(NaS x a ~ 1 ). Hence for g > 1 the 
pps estimator is always superior. 

6. In a wtr sampling scheme with unequal probabilities, let $V_1$ be the
variance of the estimator $S(y_i/\pi_i)$ and let
$v = [S(y_i^2/p_i^2) - (1/n)\{S(y_i/p_i)\}^2]/[n(n - 1)]$
be used as an estimator of $V_1$. Then prove that
$E(v) - V_1 = (V_2 - V_1)n/(n - 1)$, where $V_2$ is the variance of
$n^{-1}S(y_i/p_i)$ in wr pps sampling with $\pi_i = np_i$.

7. In samples of size n, in which the first member of the sample is
selected with probabilities $p_i$ ($i = 1, \ldots, N$), $\sum p_i = 1$ and the
remaining (n - 1) members with equal probability without replacement, prove
that

$$\pi_i \pi_j - \pi_{ij} = \frac{N - n}{(N - 1)^2}\Big[(N - n)p_i p_j + \frac{n - 1}{N - 2}(1 - p_i - p_j)\Big]$$

Hence or otherwise prove that Yates and Grundy's variance estimator
$S'[(\pi_i\pi_j - \pi_{ij})(y_i/\pi_i - y_j/\pi_j)^2/\pi_{ij}]$ would be positive in this situation.

8. It is proposed to select two different units from a stratum in order to
estimate the stratum total Y. The probability of including $U_i$ in the
sample is desired to be $\pi_i = 2x_i/X$ ($i = 1, \ldots, N$). Assuming that
the relationship between y and x is linear, prove that the variance of
$S(y_i/\pi_i)$ would be a minimum if the $\pi_{ij}$ are so chosen that $\pi_{ij} > 0$,
$\sum_{j \ne i} \pi_{ij} = \pi_i$ ($i = 1, \ldots, N$) and $\sum [\pi_{ij}/(\pi_i\pi_j)]$ is minimized.

9. a. In sampling with unequal probabilities for samples of size two,
when the first unit is selected with pps and the second unit with pps of
the remaining units, prove that Yates and Grundy's variance estimator
would be positive.

b. If the first unit in the sample is selected with pps, the second with
pps of the remaining units, and the other (n - 2) units selected with
equal probabilities without replacement, prove that Yates and Grundy's
variance estimator would be positive for this sampling system.

10. From a population of N units a sample of two different unite is 
selected in the following way. The first member of the sample '^elected 

with probabilities „ (i = 1 .TO. = V^ 0n ””ate 

s «e The second selection is made with probabilities propor 

to 2 », £«, = 1 , leaving out the unit already selected^ The «.^e so det - 

■Pined that p’ = £ [ Mi /d - Si)l « = • • • - ^ Pr ° Ve * h * 

«• The chance that U< is seI “ ted ”. % p“”ppr^ri^ to this sampling 
k* The variance of the estimator S[y %/C ** 






SAMPLING 


theory 


cheme is less than or equal to the variance of the same estimator in the 
Z W J wr sampling with pp to * 

11. From a stratum two clusters are selected by the following procedure.
Two independent selections are made with probability $q_i = N_i/N$, where
$N_i$ is the number of elements in cluster i. If the same cluster is chosen
twice both selections are rejected and two further ones made, the process
being continued until two different clusters are chosen. If the clusters
are enumerated completely, show that the bias of the simple estimator
$t = (\bar{y}_i + \bar{y}_j)/2$ is given by $(\sum q_i^2 \sum q_i\bar{Y}_i - \sum q_i^2\bar{Y}_i)/(1 - \sum q_i^2)$, where $\bar{y}_i$, $\bar{y}_j$
are the means per element of the sample clusters and the purpose of the
sample is to estimate the stratum mean per element. If the mean square
error of t around the stratum mean is estimated by $u = (\bar{y}_i - \bar{y}_j)^2/4$, show
that u will overstate the MSE and $B(u) = \sum q_i^2(\bar{Y}_i - \bar{Y})^2/(1 - \sum q_i^2)$.

12. From a population of N units a wtr sample of n units is selected 
following the general scheme of Sec. 3.17. Defining t' n as 

l’n = [(N - 1 )(N - 2) • ■ • (N - n + l)]” 1 - — _ 

PilPji ' • • p ln 

n 

prove that f = £ c,<-, 2 c< - 1 is an unbiased estimator of the population 

tarilie frtJTTr f ° r l { ° and C ° V «W>- Hence find the 
variance of (. [Note that p„ = Pr(C7,-||7i) and so on.] 

13. In wtr random sampling from a population containing N units, there 
wiU be different samples of size n, each sample ordered in n! = M 

on the sample s ind ordw of t he population parameter based 

ordered sample, p, = Y j>(. i) atl j >7 ■! >y p S, > tlle Probability of the 
’ V \ PM and P M = P(M)/p.. Prove that 

F[<M i = ??' lp ( ^-t 2 X ( p ( v-)] ! 

Henc th^ ¥ = ^ P ‘ ~ K 2 'PfefJ ] 2 

to the ordered a^M 1 ™“^’ 7- is superior 

14. Let a systematic sample of every hundredth household be taken with
random start i between 1 and 100 from a population containing 100h + k
households, h and k being integers with $0 \le k \le 99$. The number of
households in the sample will be h with a chance of 1 - k/100 and
h + 1 with a chance of k/100. Show that the expected sample size will
be h + k/100, the variance of the sample size being (1 - k/100)k/100.
If the distribution of k can be taken to be uniform over the range 0 to 99,
show that the average variance would be very nearly 1/6. Instead of
selecting i's independently at random, we may select them in complementary
pairs such that i + i' = 101. Assuming the distributions of k and k' to
be independent and uniform, show that the average variance now reduces

to 
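The averages in this exercise can be evaluated exactly by enumeration. The sketch below computes the average of (1 - k/100)k/100 over k = 0, ..., 99. It also compares independent and complementary starts under one possible reading of the scheme, which is an assumption of this sketch: two populations with remainders k and k' are sampled with starts i and 101 - i, and the variance considered is that of the combined sample size. The function names are hypothetical.

```python
from fractions import Fraction


def size_variance(k, interval=100):
    """Variance of the sample size for a population of 100h + k households."""
    p = Fraction(k, interval)
    return p * (1 - p)


def average_variance(interval=100):
    """Average of the variance when k is uniform over 0, ..., interval - 1."""
    return sum(size_variance(k, interval) for k in range(interval)) / interval


def pair_variance(k, k2, complementary, interval=100):
    """Variance of the combined sample size of two populations (assumed model)."""
    if not complementary:
        return size_variance(k, interval) + size_variance(k2, interval)
    # Start i gives an extra household in the first population when i <= k,
    # and the complementary start 101 - i gives one in the second when it is <= k2.
    totals = [(i <= k) + (interval + 1 - i <= k2) for i in range(1, interval + 1)]
    first_moment = Fraction(sum(totals), interval)
    second_moment = Fraction(sum(t * t for t in totals), interval)
    return second_moment - first_moment ** 2


def average_pair_variance(complementary, interval=100):
    total = sum(pair_variance(k, k2, complementary, interval)
                for k in range(interval) for k2 in range(interval))
    return total / interval ** 2


if __name__ == "__main__":
    print("single start:       ", float(average_variance()))
    print("independent pair:   ", float(average_pair_variance(False)))
    print("complementary pair: ", float(average_pair_variance(True)))
```

Under the assumed model the complementary starts induce a negative covariance between the two sample sizes, which is the source of the reduction.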

15. A population contains N = nk units where k is odd. We shall denote
by $\bar{y}_i$ the mean of the systematic sample based on the random start i
taken between 1 and k. The centered systematic sample estimate will
be obtained by taking the mean of the central units [numbered (k + 1)/2,
k + (k + 1)/2, ...] from each of the n strata formed when a random-start
systematic sample is selected. If the population is monotone
increasing, the centered systematic sample mean $\bar{y}_c$ will be the median of
the k random-start systematic sample means $\bar{y}_1 < \bar{y}_2 < \cdots < \bar{y}_k$. The
mean square error of $\bar{y}_c$ will be $(\bar{y}_c - \bar{Y})^2$ while the variance of $\bar{y}_i$ would be
$S(\bar{y}_i - \bar{Y})^2/k$. Use the result, (mean - median)$^2$ < variance, to prove
that centered systematic sampling is more efficient than random-start
systematic sampling in the case of monotone populations.

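16. a. In simple random sampling the variance of the sample mean is
given by $V(\bar{y}) = \left(\dfrac{1}{n} - \dfrac{1}{N}\right)\dfrac{1}{N - 1}\left(\sum y_i^2 - N\bar{Y}^2\right)$. Denoting by v an
unbiased estimate of $V(\bar{y})$ and noting that

$$E(\bar{y}^2 - v) = \bar{Y}^2$$

show that

$$E(v) = \left(\frac{1}{n} - \frac{1}{N}\right)\frac{1}{N - 1}\, E\Big[\frac{N}{n} S y_i^2 - N(\bar{y}^2 - v)\Big]$$

Use this equation to arrive at an expression for v.

b. In the case of with-replacement pps sampling, the variance of
$\hat{Y} = \frac{1}{n} S(y_i/p_i)$ is given by $(1/n)[\sum(y_i^2/p_i) - Y^2]$. With $(1/n)S(y_i^2/p_i^2)$
as an unbiased estimator of $\sum(y_i^2/p_i)$, use the above technique to obtain
an unbiased estimator of $V(\hat{Y})$.

c. For the sample design in which the population is split at random into
n substrata containing $N_i$ ($i = 1, \ldots, n$) units, and one unit is selected
with pp to x from each substratum (Sec. 6.10.3), the variance of $\hat{Y}$ is
given by

$$V(\hat{Y}) = \frac{\sum N_i^2 - N}{N(N - 1)}\Big(\sum_{t=1}^{N} \frac{y_t^2}{p_t} - Y^2\Big)$$

Using $\sum_{i=1}^{n} \pi_i y_i^2/p_i^2$ as an unbiased estimator of $\sum_{t=1}^{N} (y_t^2/p_t)$, where $p_i$ is the
selection probability of the unit drawn from the ith substratum and $\pi_i$ the
sum of the selection probabilities of the units in that substratum, show that an
unbiased estimator of $V(\hat{Y})$ is given by

$$\frac{\sum N_i^2 - N}{N^2 - \sum N_i^2}\Big(\sum_{i=1}^{n} \pi_i \frac{y_i^2}{p_i^2} - \hat{Y}^2\Big)$$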
J-l 

17. In order to estimate the mean of a finite population a sample of size
n is selected with replacement and the number u of distinct units
determined. Let the estimator used be

S ' = ~ »« = -B/M 

Show that y' is unbiased, with variance given by 

V(s,) = [ E ~ r-^ + (?>- s ~)m 

Mh "“* “ which a 

Show t h t v i8 ^ ^ thM 

** —- ^ 

Tk p i* t 1)st draw when the sample first«, , Seectlon 18 terminated at 
The last urnt is rejected, and the ™ a a talns * + 1 Afferent units, 
erent umts selected. (This nroeed ° f sam P le consists of the n dif- 
nnequal probabilities.) An observed™, 18 C > aUed ‘ nverae ““PBng with 
U, fh ’ ’ ' ' ,u,) ’ where u , pIe of r unit8 is recorded as 

’ *(T*llTx Pl t° wtha ‘ the pr «?s t sIteVi^ l i3• 2d ’ 

° f “>“‘4 4 and p' _ ^ ' 

the sense t hat t ir d0m ^bles u, “ Sa “ ple in tha ‘ »der. Given 

of “a «i, . J 01 ® 1 distribution is invari k/' are interchangeable in 

«< . u > [ £ >ve further that ^ Under “V permutation 

, J ° f n dd ferent units IwWtOi* of *» sa ”P le 

• nr! U «y 




TJ * ^ pj -- - 

Henc e prove that tk U " *>i — 

that ^ Procedure of m Versfi ^ 

sam Pbng with unequal prol 


ZEROISES m 

bilities is equivalent to selecting the first unit with pps, the second with 
pps of the remaining units and so on. 

19. In the sampling scheme of Exercise 18 let the sample of r units be
$(u_1, \ldots, u_r)$ and let z denote y/p, where y is the characteristic of the
unit and p is its probability of selection at any draw. Using the fact
that the random variables $z_1, z_2, \ldots, z_r$ are interchangeable, prove that
$\bar{z} = S z_i/r$ is an unbiased estimator of the population total Y and that
$v(\bar{z}) = S(z_i - \bar{z})^2/[r(r - 1)]$ is an unbiased estimator of $V(\bar{z})$. Prove that

7(2) = E[i’(2)] = E 


= £ PjPA z i — tj’YE I “t “ U‘’ U2 = Uf'j 


and for n = 2 


-E [; I«. - &*«• = u ‘] “ \ Pi + T% Il0g (1 " Pi ~ Pi,) + P ‘ + PA 


Consider the alternative estimator t = (fi + fa)/2 


r 

t\ = ti ~ Vi + 

Pi 


where 

Show that in this case 


l/Ul - pi) 


n — 2 


P2 


V(D = i X fe - ^) ! W 2 - p > - Pi ' ] 
Hence obtain the condition that t has a smaller variance. 


STRATIFICATION 

20. In stratified random sampling the population mean ^ S . ^ multipliers 
W4k/AT with a variance of ar-«r(AT»W/«*> ^ - i )V 

«thin strata are ignored. If **“ 'wfuHbe ^ 

1), show that the variance 0 rnve that under the cost function 
^ N-'WSSi'nrHPn - D- Hence prove that under 

C = Sc.n h , the allocation of the total sample of size " to “e 

minimising the variance of the variance estimator would 

Under what conditions is the optimum allocation for variances the same 
*8 that for the means? 


23J SAMPLING THEORY 

21. A population consists of k strata of sizes $N_j$ and mean values $\mu_j$
($j = 1, \ldots, k$). We are interested in estimating r linear functions
$L_i = \sum_j l_{ij}\mu_j$ ($i = 1, \ldots, r$) of the strata means by selecting $n_j$ units
at random from within the strata. Assuming the cost function to be
$C = \sum c_j n_j^g$, $g > 0$, obtain the values of the n's such that for a fixed cost
the expected loss given by $\sum_i m_i V(\hat{L}_i)$ be minimized. ($\hat{L}_i$ is an estimate of
$L_i$ and $m_i$, $l_{ij}$ are known constants.)

22. A random sample of size n is selected from a population and the
sample units are allocated to L strata on the basis of information collected
about them. Denoting by $n_h$ (a random variable) the number of sample
units falling in stratum h, show that the variance of $\sum W_h\bar{y}_h$ (with $W_h$ known)
would be given approximately by $(1/n - 1/N)\sum W_h S_h^2 + (1/n^2)\sum(1 - W_h)S_h^2$.
Compare this variance with the one obtained in stratified random
sampling with proportionate allocation when it is feasible to select
units within strata.


23. A population is divided up into two strata with NJN, = d FUR - \ 

! ni/nJ/ W , where * - a general allocation 7the 
total sample size n, and n„ n, is the optimum allocation for purposes of 

“el 116 P°P“ la «oa mean in stratified simple random samphng 

Sir » «. 


Ot - n(\d + l ) 2 (\fjd + l)-i(Xd + M )-i 

vaiues ° f h sh ° w that the “p^- 

a < 2. hat there 18 no appreciable loss of precision for X < 

24. A variate is uniformly distributed in the range a, a + c. The
range is cut in L equal parts to form L strata of equal size. From each
stratum a simple random sample of n/L units is taken. Denoting by $V_{st}$ and
$V_{ran}$ the variances based on stratified and unstratified samples of size
n respectively, prove that $V_{st} = V_{ran}/L^2$.
25 p ^ V ^ • 

l (z) denote the ordinate at x andWf^ Z ? r ° mean an< * unit variance > let 
an **• If this distribution + Xl,x ^ ^e relative frequency between 
^ = an M d the varia ^ truncated at * and * * show that the 

Xl ) K%i))/W(x runcated distribution are given by 


<r 2 



g foi) ~ l(x 2 )]* 


Hence divide the positive half of the normal population into L = 2, 3, 4
strata of equal aggregate output $W_h\mu_h$. If the total sample size is allocated
equally to the strata, show that the variances of the mean are propor¬ 
tional to 0.036, 0.023, 0.016, respectively, the corresponding figure for 
unstratified sampling being 0.091. 

26. A population is divided up into L strata, $N_i$ being the number of
units in stratum i. Information on an auxiliary character x is known
for each unit. A sample of size n is selected from the entire population
such that the probability that $s_n$ be selected is proportional to $\sum N_i\bar{x}_i$,
where $\bar{x}_i$ is the mean of x based on the $n_i$ sample units taken from stratum
i. Prove that $\hat{y}_{st} = X(\sum N_i\bar{y}_i)/\sum N_i\bar{x}_i$ is an unbiased estimator of the
population total Y. Obtain an expression for the variance of $\hat{y}_{st}$ and an
unbiased estimator of $V(\hat{y}_{st})$.

27. A population is divided up into L strata, the imputed size of a unit 
in the tth stratum being Xi. Assume that ya = a + fix* + e.y where 
E{ea\xi) = 0, V (e^z,-) = axS. Denoting by V p , V 0 , and V pf>a the vari¬ 
ances of the stratified proportionate, stratified optimum and unstratified 
with-replacement pps estimates of the population mean, show that 


a v* o v 

E ^ - za l * ~ wi l * 


= 0 


^ (l Xi ' n Y -ifil* 

B(v -- ) = ( x X x< ‘~' -X*0 01 = 0 

Hence prove that the stratified optimum estimator is superior to the pps 
estimator. Further prove that the condition for the pps estimator to be 
superior to the stratified proportionate estimator is p(x,x 9 *) > 0, piou e 
that (n — l)/N is negligible relative to unity. 

28. A population is divided into L strata, stratum h containing $N_h$ units
from which $n_h$ ($h = 1, \ldots, L$) are to be taken into the sample. The
following procedure is used. One unit is selected with pp to x from the
entire population. If this unit comes from stratum h, a simple random
sample of further $n_h - 1$ units is taken from it. From the other strata
simple random samples of the specified sizes are taken.
Show that $\sum N_h\bar{y}_h/\sum N_h\bar{x}_h$ is an unbiased estimator of $Y/X$.

29. There is a symmetric continuous bivariate distribution with
frequency function f(x,y), $a \le x \le b$, $c \le y \le d$. The problem is to divide
this population into four strata by lines parallel to the axes
through the point $(x_0, y_0)$. If sample allocation is proportional, show that
the point $(x_0, y_0)$, for which the generalized variance

$$\begin{vmatrix} V(\bar{x}) & \mathrm{Cov}(\bar{x},\bar{y}) \\ \mathrm{Cov}(\bar{x},\bar{y}) & V(\bar{y}) \end{vmatrix}$$

is a minimum, is the center of gravity of the distribution.

30. In a stratified design it is required to estimate the means $\bar{Y}_i$ of k
characteristics (i = 1, 2, . . . , k). The problem is to allocate the total
sample of n units to the L strata in an optimum manner. In order to
achieve this, the weighted sum of the variances $\sum a_i V(\bar{y}_i)$, $a_i > 0$, $\sum a_i = 1$
is minimized subject to the constraint $\sum n_h - n = 0$. Show that

$$n_h \propto n N_h \Big( \sum_i a_i S_{ih}^2 \Big)^{1/2}$$

This gives a set C of allocations $n(n_1, \ldots, n_L)$ for different vectors a.
Show that this set is complete in the sense that given any allocation
$n'(n'_1, n'_2, \ldots, n'_L)$ not belonging to C, there is an allocation $n^0(n_1, \ldots, n_L)$
in C which is better than $n'$. The term "better" is understood in the
sense that

$$[V(\bar{y}_i)]_{n^0} \le [V(\bar{y}_i)]_{n'} \quad \text{for all } i$$

and

$$[V(\bar{y}_k)]_{n^0} < [V(\bar{y}_k)]_{n'} \quad \text{for at least one } k$$
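The compromise allocation of Exercise 30 is easy to compute once the weights a are fixed. The following is a minimal sketch, assuming the allocation is proportional to $N_h(\sum_i a_i S_{ih}^2)^{1/2}$ and scaled to total n; the function name and argument layout are hypothetical, and rounding to integers is left out.

```python
from math import sqrt


def compromise_allocation(n, N, S2, a):
    """Allocate n units to strata in proportion to N_h * sqrt(sum_i a_i * S2[i][h]).

    N:  stratum sizes N_h.
    S2: S2[i][h] is the variance of characteristic i within stratum h.
    a:  non-negative weights a_i adding up to 1.
    """
    scores = [
        N[h] * sqrt(sum(a[i] * S2[i][h] for i in range(len(a))))
        for h in range(len(N))
    ]
    total = sum(scores)
    # Scale so that the (possibly fractional) allocations add up to n.
    return [n * s / total for s in scores]
```

Putting all the weight on one characteristic, for example a = [1, 0], gives the usual optimum (Neyman) allocation for that characteristic; varying a traces out the set C of the exercise.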


r —» - — —- — VUV IV 

i rr ts in Ti 

it isnTknom wbTchunit “'“u ““ Sa “ ple from within substrata, sine 

random ofn lZl "f l "J ^ Substfata - Thus a *tr simpl 

sample units are allocated to the * k T ” 1 tbe entlre P°Pulation and th 

collected about them For, . SUbs ‘ rata on the <* informal 

■ f or a umt m the sample, we define 

4% l/ 


y' ~ y x = i 

y' * o x = o 


] {the unit is in cell (i,j) 
lf the unit is not in cell (i,j) 


mi 0 ~ is not in cell (i i 

Then Sy'/Sx is a ratio estimate of th - 

Theorem 5.3, we have * ° f ^ W) ’ 


Hence, 


Since y' - f 


Sx 


*<SH 


■] 


s (y' - ? ijX ) 

w,s «. >t IS easy to see that ^ ^ “ mt belon SS to cell (i,j) an d zero oth 


y§y^ 

Sx 


~ (0 G' jf) 



w 




Hence show that under certain conditions (to be stated),


where 



(» " h) JTTi 11 {N » ~ 1 w 

* i 

? = y y NqSy' 

Sx 




Consider another situation in which the stratification in direction j is 
ignored so that Ni units belong to the stratum i {i = 1, 2, . . . , L). 
Prove that the variance of It = 2(NiSy'/Sx ) in this case is given by 

' ' f 

Prove that, correct to the approximations used, the latter estimator which 
makes use of smaller information on the population has a larger variance 
as compared with the former estimator. 


RATIO AND REGRESSION ESTIMATION 


32. In wtr simple random sampling it is usual to use $\bar{y}/\bar{x}$ as an estimate
of $R = Y/X$. Obtaining an exact expression for the variance of $\bar{y}/\bar{x}$,
show that a sufficient condition for the usual approximation
$V(\bar{y} - R\bar{x})/\bar{X}^2$ to be an understatement for $V(\bar{y}/\bar{x})$ is that



(y - RxY 


> 0 


where p stands for the correlation coefficient. 

33. Let $y_i$, $x_i$ (i = 1, . . . , m) be unbiased estimators of Y and X
respectively, based on m interpenetrating subsamples of the same size.
Prove that $\bar{r}X + m(\bar{y} - \bar{r}\bar{x})/(m - 1)$, where $\bar{r} = m^{-1}\sum (y_i/x_i)$, is an unbiased
estimator of Y.


34. If the regression of y on x is linear, that is, $E(y|x) = ax + b$, show
that in simple random sampling the estimator $\bar{y}/\bar{x}$ will give a smaller
large-sample variance than the ratio-type estimator
$\bar{r} + (N - 1)n(\bar{y} - \bar{r}\bar{x})[N(n - 1)\bar{X}]^{-1}$, where $\bar{r} = n^{-1}\sum (y_i/x_i)$. Prove this result by using the
corollary to Theorem 5.8, showing that $R - a = b/\bar{X}$, $E(\bar{r}) - a = bE(1/x)$
and using $E(x)E(1/x) > 1$.
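The ratio-type estimator in this exercise has the form usually attributed to Hartley and Ross, which is exactly unbiased for R under simple random sampling without replacement. The sketch below checks that property by averaging the estimator over every possible sample of a small population; the function name and the data are hypothetical, and the population is assumed to have strictly positive x values.

```python
from fractions import Fraction
from itertools import combinations


def ratio_type_expectation(x, y, n):
    """Exact expectation, over all wtr samples of size n, of
    rbar + (N - 1) n (ybar - rbar * xbar) / (N (n - 1) Xbar)."""
    N = len(x)
    x = [Fraction(v) for v in x]
    y = [Fraction(v) for v in y]
    Xbar = sum(x) / N
    samples = list(combinations(range(N), n))
    total = Fraction(0)
    for s in samples:
        rbar = sum(y[i] / x[i] for i in s) / n   # mean of the unit ratios
        ybar = sum(y[i] for i in s) / n
        xbar = sum(x[i] for i in s) / n
        total += rbar + Fraction((N - 1) * n, N * (n - 1)) * (ybar - rbar * xbar) / Xbar
    # Every sample is equally likely under simple random sampling.
    return total / len(samples)
```

The returned value can be compared with sum(y)/sum(x). The variance comparison asked for in the exercise is a separate, large-sample question that this check does not address.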

35. Let $y_i$, $x_i$ (i = 1, . . . , n) be unbiased estimators of Y and X, respectively,
based on n interpenetrating subsamples. Consider the following
two estimators of R = Y/X: $R_1 = \bar{y}/\bar{x}$ and $R_2 = n^{-1}\sum (y_i/x_i)$. Using
Theorem 5.2, show that approximately

$$B(R_2) = nB(R_1)$$

Hence show that an estimator of $B(R_1)$ is given by $(R_2 - R_1)/(n - 1)$
so that $R_1 - (R_2 - R_1)/(n - 1) = (nR_1 - R_2)/(n - 1)$ is an almost
unbiased estimator of R.

1 • 1 i • 1 


to ui uiucr n \ Jtienc 

V(h) = <t v K\ - p 2 )/ n . 

irT • —* 

groups is taken with equal probabilities w„ / n / Wtr sam P le of k 
means of the n/k units in the ith select on W the by *• * the 

any function of the p’s and the x’s nfth.tV h ® r ,° Up) and b ‘ sha11 be 
g-ven split of the population, * x a r e unb , Then Pr ° Ve that *« a 
respectively and (k~> _ s -i)wi ' rf, blased estimators of Y and X 

estmator Coy <fc) where 7 *>'£ ~ « » an unbiased 

e « '7° its “Pectation ? J nit'll -7 Sx</k ' Considering y + 

^matorofKisgivenby, + ** _ « -biased 

of V unite' S Thefir“ d a 0m i Sample ° f n u “it s is selected f 
V and x, respectively r Selectl ° ns g^e a mean of v * populatlon 

selections form a rand GlVen the first a selection°/° r the character s 

Tha s, given the first a ^ Sainple fro m the popular 6 ^ " a) remainin g 

- «) and f J Selec - tl0ns ' (« - a£)/T T ° f N ~ a ^ 
denoting by f( z ) V a y^/{n — a \ e ^ a ) estimates (NX — 

;,*'•>« - «>• 

’ ^ the a selections 1“ fr ° m the first « 

/ _ ’ the ex pected value of 

n — a ~~ f( z °) NX 

is given by (Nv - - v, 71 - a 

niator of Y ic • a ^/(W — a) tr 

K 18 P r °vided by ^ H enc e prove f 

t ( a) iV - a . at aU unblased es 




f( z a)(x 


Jhl ^ (AT — 




give the forms of ((a) for the particular „„ a? 

r cases when/fc) , s 

(!) ’.SfifS^O 

( 2 ) 

( 3 ) 

By forming f(a) for all possible values 0 f n a a 

what estimators do you get in situations (1) to (3)TboTC^ S ^ results ' 

39. A simple random sample _ )( . = j' 

from a finite population. The proposed ratio eltimale of f isV-TT 
where r,-= y'/aij and StOi = 1 . We sh»ll ^ * , ° lsy ~ l w ‘ r ‘X‘, 

the variances of y based on the auxili cTarlcWx' P) “ d V(S,M) 

x h • • • > x q , q > p. Jf the allocation of flip ■ i, .* - * > x v ai *d 
either case, prove that V{y\ v ) > V{y\p, q ). weights w < ls optimum in 

40. A population is divided into L strata rw • , 

each stratum with certain nrnhahnv a ^ Unit ls se ^ ect ed from 

estimates of ?, 1 "££‘”1 “ d *» ^ ™ *> ft * as 

this type, giving t T(i - i m ^penetrating samples of 

rp , giving y„ M* - 1, . . . , k). We shall use the notation: 


V.i 


■n >-n ’-iT-ii 


Prove that an unbiased estimator of Y is provided by 


f-j + ^ U = f„X + 

AC — 1 


k(y>t — f tt x,t) 
k - l 


Obtain the variance of this estimator.

41. Show that an estimate of the variance of the combined ratio
estimator based on a stratified sampling design (Sec. 5.18) is given by

$$\sum N_h^2 (1 - f_h) n_h^{-1} (s_{yh}^2 + R^2 s_{xh}^2 - 2R r_h s_{yh} s_{xh})$$

the relative variance being estimated by

V , ( s u^ i s **> 2 o rhSuhSxh \ 

4 N h \l - f h )n h + p 2 YX ) 



tu RY 


NS — 
2P 


NS 


,h *0r> 

In case two units are selected from each stratum, show that 

Ar 2 «»** _ ( N *y *i - NhvmV 

* 2P~\ w ) 

AT 2 Sxb _ ( NkXh 1 Nh% hi \ 2 

2 P V 2% ) 

and NS rj ~^ = —* yAI ~ - N h x hz 

2YX 2f 2? 

Hence show that for n* = 2 and / — ? , . 

variance is given simply by ’ “ unblased estimate of the 

0 - /> l _ jVg^-^ A, 

£ a -dom sam . 

IH(*1 + I,)T m e T * *!? oomplete sample is r - o/J*- i/i**' wii1 ' 

“sume that the*« “? V* act “ a “y the sample means'^ V" ^ 

», which is 0 („-A .A 5 &re normal| y distributed k eshali 
£M*1 -Or A and that ^ = a + 6* 4_ , r d Wlth variance 

shall + where Z = »■ 

J ~ > Neglecting terl”rT ment SUch ‘hat EM t Sm' We 

A + 3A * + 15V, f r!f n ' < or ‘ewer, show that *v"u ‘ 1 = 

Prove that the bias of r f +3A + 15A 2 + 105 /js TI ,, x ‘ ~ 1 + 
0("-) and that F(r) = A ****** R is a(h *'gj 7 * ® "suits to 

On the other han^ c ^ + 8 ^ 2 + 69h 3 ) -l _i D l 15 ^ )> which is 
Quenouille) i s U8ed ■/. the . estimator < = 2r L’fvw + 15A ’ + 105A '>- 

F (0 = a J h . 4 ., ’ lta bias is a(6V + 901 ,, . . ( ^H r J + r t ) (due to 

than r W ( * + 441 + 12A '> + 5(1 + 2V+U W + h ; h 8 - 0 ^). Further 

-r ius/i ), which is smaller 


to be vi— . 18 'Snored, /W - jt , .. , 


; ^^cr element A . -‘“o^ers is seW+,aj ^ ei 

^ 6 V(Mf) where £ 2 Unblased estimator w u estlmate the 
'°n correctiont e f **<*, - ^ W ° U,db n%/nwit 

to be g,ven by 0 J/. ft Assu ming s!f th 1) ’ and th e fir 

! V "’ find the optim.f > °’ and the cost Vanance with 
t'mator i, a ***“« -» of the cluZ f nCti ° n to he C 

sr <z «: h ». ~ 



44. In a two-stage design the number of second-stage units $M_i$ is known
for each psu in the population. A sample of n different psu's is selected
in the following manner. The psu's are selected with replacement with
pp to $M_i$. Selection continues till (n + 1) different psu's are taken where the
last psu to enter the sample does so at the (r + 1)st selection. The last
psu is rejected. Every time the ith psu in the population is selected in
the sample, the same random sample of $m_i$ second-stage units is taken.
Denoting by $r_i$ the number of times the ith psu occurs in the sample, show
by the conditional argument that

$$E\left(\frac{r_i}{r}\right) = \frac{M_i}{M_0} \qquad E\left[\frac{r_i(r_i - 1)}{r(r - 1)}\right] = \frac{M_i^2}{M_0^2} \qquad E\left[\frac{r_i r_j}{r(r - 1)}\right] = \frac{M_i M_j}{M_0^2} \qquad M_0 = \sum M_i$$

Hence show that $\bar{y} = \sum r_i \bar{y}_i / r$ is an unbiased estimator of the population
mean per subunit, the variance estimator being

$$\frac{\sum r_i (\bar{y}_i - \bar{y})^2}{r(r - 1)}$$


45. We have a population consisting of L strata, $N_h$ psu's in stratum
h, and $M_{hi}$ second-stage units within psu hi. A simple random sample
of $n_h$ psu's is selected from $N_h$ and a simple random sample of $m_{hi}$
second-stage units is taken from the $M_{hi}$. Estimate from this design the ratio
R = Y/X. How will you estimate the variance of the estimate of R and the within-psu
and between-psu components of the variance?

46. A stratum contains N psu's from which n are selected with replacement
with probabilities $p_i = M_i/M_0$ (i = 1, . . . , N), where $M_i$ is the
number of subunits in the ith psu and $\sum M_i = M_0$. If a psu occurs in the
sample $\lambda$ times, a wtr random sample of $m\lambda$ subunits is selected from it.
Prove that $t_1 = \sum \lambda_i \bar{y}_i / n$ is an unbiased estimator of $\theta = \sum p_i \theta_i$, where
$\theta_i$ is the mean per subunit in the ith psu and $\lambda_i$ is the frequency with
which this psu occurs in the sample, $\bar{y}_i$ being defined as zero when no
sample is taken from the psu. Further prove that

V(h) = A + B — C where A = rr 1 £ 


B = n-'ZiniOi - B ) 2 

f) 


V 


If subsampling is carried out independently every time a psu enters the
sample (by taking a wtr random sample of m subunits), prove that $t_2 = \sum \bar{y}_i / n$
is an unbiased estimator of $\theta$ and $V(t_2) = A + B$. Hence show that
$V(t_1) < V(t_2)$. Show that in the first case


V(k) = 


- y ) 2 _ 2 

n(n - 1) n(n 


¥ 1 ; [<-■»■ i - ~] 


while in the second case

$$v(t_2) = \frac{\sum (\bar{y}_i - \bar{y})^2}{n(n - 1)}$$


47. A population is divided up into L strata, stratum h containing $N_h$
psu's each having M subunits. A simple random sample of $n_h$ psu's is
selected from stratum h (h = 1, . . . , L) and a simple random sample of m
subunits is taken from each. Estimate from this sample the gain due
to stratification by estimating the variance for an unstratified design in
which $n = \sum n_h$ psu's are selected at random and m subunits taken at
random from each sample psu.

is taken and suhuidts^are selected at ICh d random sam P ] e of psu’s 
X The expected number of ^ ^ the * h P su if * “he 

Shn = which we shall c»n ?l t thesampIeisth en£;(5m) = 
Show that, given cost til Cali the e *Pected cost of th. 

estimate f - (JV/n)S ^ variance of the population 7oTal 

= v *v.. 


*r + ifr 

= N‘(±_l\ 

V * n) 


‘Hz 


ll 

* i 


Mi t (T w . 


T: ?*■-’ (■-'■■) 

respectivoi a Possible 

number 0 f su u y '. C °nsider now P ° S Und <T> °» stand f( 

mi * + m, , Ubunit s to be * T W an °ther sitnaf . 
minimum* • ‘ + m n , - a en ^om ea i 10n i n which m 0 is tl 
“ Vana nce is ’• ' *. for all , **£ »»ple of psu - s s0 tin 

' thls c ase, show that t’ 


V i = i\T 2 


(Hi\ 

\n 


a 2 -f 1 ATa 




1 N 


o 


fV 2 v* V* 



Prove that V* - V x > 0, so that the sampling plan in which the sample
size is a random variable gives a smaller variance.

49. In a two-stage design one subunit is selected with pp to x from the
entire population. If this happens to come from the ith psu, a wtr
random sample of $m_i - 1$ subunits is taken from the $M_i - 1$ that remain
in the psu. From the other (N - 1) psu's a wtr random sample of
(n - 1) psu's is taken. Subsampling of the selected psu's is wtr simple
random. Show that $\sum M_i \bar{y}_i / \sum M_i \bar{x}_i$ is an unbiased estimator of R = Y/X.

50. From a stratum containing N psu’s a sample of n psu's is selected 
following the scheme of Sec. 3.17. Each psu is subsampled in a known 
manner. Let Ti be an unbiased estimator of the psu total for y based 
on subsampling. Defining z n as z' n = [(N - 1)(N - 2) • • • (N - 

n 

n 4- 1 )]~ l TJ(pi\Pn • * ' Pm), prove that Z' = J = 1 is an 

1 = 1 

unbiased estimator of the stratum total. Obtain the variance of Z' and 
an unbiased estimator of the variance. 

51. A stratum contains N psu's with measures of size $x_i$ (i = 1, . . . , N).
One psu is selected with pps and another with pps of the remaining units.
Let $T_i$ (i = 1, 2) be an unbiased estimator of the total of the ith sample
psu, the estimator being based on subsampling from the psu. Then
prove that $Z_1 = T_1/p_1$ and $Z_2 = T_1 + (1 - p_1)T_2/p_2$ are unbiased estimators
of the stratum total and $V(Z_2) < V(Z_1)$.
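The two ordered estimators can be examined exactly in the simplest special case. The sketch below assumes that there is no subsampling error, so that $T_i$ equals the psu total $y_i$, and that $p_i$ is proportional to the size measure $x_i$; the function name and data are hypothetical. It enumerates all ordered pairs of draws and returns the exact means and variances of $Z_1 = y_1/p_1$ and $Z_2 = y_1 + (1 - p_1)y_2/p_2$.

```python
from fractions import Fraction


def ordered_estimator_moments(x, y):
    """Exact (E Z1, E Z2, V Z1, V Z2) for two pps draws without replacement.

    x: size measures (selection probabilities are x_i / sum(x)).
    y: psu totals, used in place of the estimates T_i (no subsampling error).
    """
    X = sum(Fraction(v) for v in x)
    p = [Fraction(v) / X for v in x]
    y = [Fraction(v) for v in y]
    e1 = e2 = m1 = m2 = Fraction(0)
    for i in range(len(x)):
        for j in range(len(x)):
            if i == j:
                continue
            # First draw i with probability p_i, second draw j from the rest.
            prob = p[i] * p[j] / (1 - p[i])
            z1 = y[i] / p[i]
            z2 = y[i] + (1 - p[i]) * y[j] / p[j]
            e1 += prob * z1
            e2 += prob * z2
            m1 += prob * z1 ** 2
            m2 += prob * z2 ** 2
    return e1, e2, m1 - e1 ** 2, m2 - e2 ** 2
```

Both expectations should equal the stratum total sum(y), and the two variances can be compared directly. The exercise itself also allows for subsampling within psu's, which this sketch leaves out.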

52. Suppose a population consists of N pan's out of which » are: sdected 

Sx ‘> where x, is some measure of si th ^^ the se cond 
for the fth psu there is an estimate T. 0> such that E,(T<) = Y„ 

and subsequent stages) of the total . P unbiased 

un = .f -- *,(«*>; Show ‘hat o ? bt ; n Z lf iprtL for V(t) and 
estimator of the population total. 

an unbiased estimator of F(F). 

53. Consider a population of four units with the variate-values $y_1$, $y_2$, $y_3$,
and $y_4$ arranged in ascending order of magnitude. If the population is to
be divided into clusters of two units each, there are three ways of
forming the clusters, one of the clusters being $(y_1,y_2)$ or $(y_1,y_3)$ or $(y_1,y_4)$
in the three situations. The object is to estimate the population total
by selecting one cluster at random. The absolute errors (of the estimate
from the true value) are denoted by $e_1$, $e_2$, $e_3$. Show that $e_3 \le e_2 \le e_1$.
Hence show that, for loss functions monotonically increasing
with e, the risk is a minimum for the third method of forming clusters.
Generalize this result to populations containing an even number of units
by proving that the best way of forming clusters of two units is to take
units from either end.
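The ordering of the three absolute errors can be seen numerically. In the sketch below (the function name is hypothetical) the total is estimated as twice the total of the selected cluster, which is the usual estimate when one of two clusters is chosen at random; both clusters of a given pairing then have the same absolute error.

```python
def pairing_errors(values):
    """Absolute errors e1, e2, e3 when the smallest unit is paired with the
    second, third, or fourth smallest unit, respectively."""
    y1, y2, y3, y4 = sorted(values)
    total = y1 + y2 + y3 + y4
    # Estimate of the total from the cluster containing y1 is 2 * (y1 + partner).
    return [abs(2 * (y1 + partner) - total) for partner in (y2, y3, y4)]
```

For example, pairing_errors([1, 4, 6, 15]) gives [16, 12, 6], so pairing the units from either end produces the smallest error in this case.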

7^!! ider the following multistage generalisation of the sample design 
si . Consiae ^ One psu Jg seleeted from each substratum 

described in E ^ sample psu i s subsampled in a known manner. 

BWon subsampling, let Us be an unbiased estimator of which is the 
f“l value of y for Ua, the jth psu belonging to the tth substratum. 
Given Uss, let VMl, < UUs) stand for the variance of Us and an unbiased 
estimator of it. Then show that 

f -|,s 

~ r.) + r ? if 


Show further that 


'SsHSH-nf 

E{ f 2 — v) = Y 2 E(v) = 7(f) 

Hence „ = ~ N (\ JlL _ *A , ~ D V UUs) 

N 2 - 2JV? \4 pi/p'.. / ' N 2 — 2Ni 2 4 pj* 

_ SAT,- 2 ~ N y Vz(t, 

N 2 - 2 Ni 2 4 pap 

ertin Per , / tr / a ^ m ’ a S !^ pler Var ; an ' 

NS V 2 for = n/2 = n 2 ~ ^/Pa,-) 2 , then E(u) = 7(f) 

SS. A population contains N n«n’« m u • - 

the zth psu. A samole of ? I 1 S> ^ ein S the number of subunits 

bilities p if 2p f = j jr ,, n psa s ls se ^ ec ted with replacement with prob 

Pendent subsamples of n^suhim^ ° Cm [ S m the sam P le X, times, X, ind 
being wtr simple random f 1 u eaC ^ are se ^ ec ted from it, samplii 
estimators of the population total '^ SUbsample - Consider the followii 


f = 

n i Pi 


SV 

7 = 1 r 


V he fet — Si is the 

SUbUttits - I» the latte ^72 SUbUnit ^ on a sample 

the me “ V. in based on the distil 





units in the sample of subunits qu 

ance than t. ^ Sho " ‘hat }>- gives a smaller ^ 

56. In a multistage design the p su ’ s ar 

and sampling is done independently in the W ‘i h ° ut re P la «ment 

a is estimate made from the subsamnle H„„ ■ f = Z <> "here 
show that t = &, is an unbiased esZ^T/'z ^ ^ The “ 
7(0 = ftf) + W. where ' ^ TOth a variance of 


i - -SZ, <r,.! = £[(*, _ 
And an unbiased estimate of V(t) is provided by 

tfy) = [v(Z)] Zi _ 2i -j- SiTi&i 2 


where v&j is an unbiased estimated of V(2) in one-stage sampling and 
a, 2 is an unbiased estimate of <r, 2 based on sampling at the second and 
subsequent stages. Hence obtain the following rule for estimating the 
variance in multistage sampling. 

Based on one-stage sampling, get an unbiased estimate v(Z) of V(Z). 
In this replace Zi by Zi. To this add Siridi 2 where dr is an unbiased estimate 
of the within psu variance of 2 ,-. 


DOUBLE SAMPLING AND REPETITIVE SURVEYS

57. When the initial random sample of size n' is taken for purposes of
stratification and the subsequent subsample for collecting information on
y within strata, show that an unbiased estimator of the variance of $\sum w_h \bar{y}_h$
(Sec. 7.8) is provided by


n' y 17 , a*JV-n'W JLl2l * (fc - ? «*)'] 

*rT2,[( fc n'Y^TJn. + n'(N-l)^ y L >\ 


assuming that $n_h/N$ and $1/N$ are negligible as compared with unity.
Show further that the estimator of variance reduces to $\sum (w_h^2 s_h^2 / n_h)$ if the
$n_h$ are small as compared with n'.

f „r ,,nifc is taken from a population 
ss - A preliminary random sample o A random sub- 

an d information collected on p variates Xl > Xs > ’ r ^j Show that an 
sample of size a is selected on which y > s b ° 

anbiased estimator of the population mean for y • «"« y 


Jft =* l W ' Ui 

. t P ) are the means for 

^bere Ui = y - kiixi - *»') and & ** ^ “ ' * ’ ’-~ 


r 


^ sampling th EOrv 

244 T 

, i => ; s the mean of x% in the preliminary sample and fc/s arc 
r—being the weights adding up to unity. Show 

that V(0) = »■'«"'* where 

/ n.\ /. »\ (If If Q '- " ■ ~ ■ 


Aj 


= f 1 -S) So, + ( 1 n') 


and Suv is the covariance between u and v and 0,1, stand for 

y Xi, X 2 , . . • , £p, respectively. Show that an unbiased estimator of 

V(M) is given by 



n 

(n - l) -1 5 (y, - £) 2 



(n - l)"bS 




What would be the results if only one x -variate be used? 

59. In order to estimate the total of a stratum for the character y, an
initial sample of n' psu's is selected at random for which information is
collected inexpensively on a character x. Then a subsample of n psu's
is taken with replacement with pp to x. Let $T_i$ be an unbiased estimator
of the ith psu total $Y_i$, based on subsampling at the second and subsequent
stages. Show that $(N/n')(x'/n)\sum (T_i/x_i)$ is an unbiased estimator
of the stratum total Y, where N is the number of psu's in the stratum and
x' is the total of x in the preliminary sample. Prove further that the
variance of the estimator is given by

(AT - 1)-W (l - n-i ^ P< - Yj + (n')-W(iV - n’)S ,■ 

+ (N - l)-‘N(nn'r’ [(AT - n") £ V(T,) + (n' - l)X 7 


If the second sample is taken independently of the preliminary sample, 

show that the additional contribution to the variance due to subsampling 
is 


r 2 \n’ - N) ^ + **] to*- 1 1 


L \n iv/ 4 

60. A random sample of n' units is selected to observe the vari; 
while a random subsample of size n is taken to observe y. The oby 
to estimate the population mean Y. Show that u = x'y/x and ® - # 

)[n (n )1 (y fx) are Competing estimators of Y, v 

r ~ n S(yi/xi). Calculate V(u ) and Pfy) and obtain the condi 

ft* -W ii v 



ci A population is sampled on two occasions. To use tfip rmt +• * 

; 7 9 2, an estimate of M\ is provided by the mean * 1 S°+1“ 2 
h e sample on the first occasmn. On the second occasion, when a propor! 
0» M°f the ( U " tS ‘^elected afresh .t ,s desired to make the best unbiased 

estimates of and the change A = M, - U, in such a manner that 
* „ j|? 2 - x. This means that Af 2 is of the form (e + \)x' ~(e4- 4 . 

7+ (X - c)9", whereas A is of the form - ( c + i)J» + + £, £ 

(1 — c)y"■ Determine the constants e and c such that V(M 2 ) -f 7 (a) is 

a minimum. 

62. A survey is to be planned on two occasions to estimate $\theta = aM_1 + bM_2$
where $M_1$, $M_2$ are the means on the two occasions and a, b are known
constants. A sample of n units will be taken on the first occasion. On
the second occasion a subsample of $n_1$ units will be taken, as well as an
independent sample of $n_2 = n - n_1$ units. If the total cost of the survey
be represented by the function $C = c_0 + cn + c_1 n_1 + c_2 n_2$, obtain the
best values of n and $n_1$ so that for a specified cost the variance of the
estimator is minimized.

63. A population is sampled over several occasions in the following
manner. Each time an independent random sample of n units is taken.
During the enumeration each member of the sample provides data both
for the current period and also for the previous period. Let $\bar{y}_h$ and $\bar{x}_{h-1}$
denote the sample means for the hth and (h - 1)st occasions based on
the sample taken on occasion h. Then the best linear estimator $\hat{M}_h$ of
the mean on occasion h will be of the form

$$\hat{M}_h = \bar{y}_h - a_h \bar{x}_{h-1} + a_h \hat{M}_{h-1} \quad \text{with } a_1 = 0$$

and

$$\hat{M}_{h-1} = \bar{y}_{h-1} - a_{h-1} \bar{x}_{h-2} + a_{h-1} \hat{M}_{h-2}$$

In order that $\hat{M}_h$ have minimum variance, the condition is:

$$\mathrm{Cov}(\bar{x}_{h-1}, \hat{M}_h) = \mathrm{Cov}(\hat{M}_{h-1}, \hat{M}_h)$$

We shall assume that $V(\bar{y}_h) = \sigma^2/n$ on each occasion and that $\rho$ is the
correlation coefficient between consecutive periods. Show that

$$\mathrm{Cov}(\bar{x}_{h-1}, \hat{M}_h) = (\rho - a_h)\sigma^2/n$$

$$\mathrm{Cov}(\hat{M}_{h-1}, \hat{M}_h) = a_h \mathrm{Cov}(\hat{M}_{h-1}, \hat{M}_{h-1}) = a_h V(\hat{M}_{h-1})$$

$$V(\hat{M}_h) = \mathrm{Cov}(\bar{y}_h, \hat{M}_h) = (1 - \rho a_h)\sigma^2/n$$

Hence show that $a_h = \rho/(2 - \rho a_{h-1})$ and $V(\hat{M}_h) = (1 - \rho a_h)\sigma^2/n$.
Show that the sequence $\{a_h\}$ converges with the limiting
value $a = [1 - (1 - \rho^2)^{1/2}]/\rho$ and the limiting variance of $\hat{M}_h$ is
$(1 - \rho^2)^{1/2}\sigma^2/n$.

64. A population containing N units is sampled over time in this manner.
At occasion h a random sample of $n_h$ units is taken and each member of
the sample provides data for the current period as well as for the previous
period. To use the notation of Exercise 63, the best estimator of $M_h$ will
be of the form $\hat{M}_h = \bar{y}_h - a_h \bar{x}_{h-1} + a_h \hat{M}_{h-1}$. Show that the coefficient
$a_h$ is given by $n_{h-1}\rho/(n_{h-1} + n_h - n_h a_{h-1}\rho)$ and $V(\hat{M}_h) = \sigma^2(1 - \rho a_h)/n_h$.

Suppose it is desired to take sample sizes $n_h$ so that $V(\hat{M}_h) = \sigma^2/g$ is the
same for each occasion. Show that in this case $n_1 = g$ and $n_h = g(1 - \rho^2)^{1/2}$
and $a_h$ are independent of h for h > 1.
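The recursions in Exercises 63 and 64 can be iterated numerically to see the limiting behaviour. The sketch below is illustrative only: the function names are hypothetical, variances are reported as multiples of $\sigma^2$, and the formulas used are the ones stated in the two exercises, namely $a_h = \rho/(2 - \rho a_{h-1})$ for equal sample sizes and $a_h = n_{h-1}\rho/(n_{h-1} + n_h - n_h a_{h-1}\rho)$ in general, with $a_1 = 0$.

```python
from math import isclose, sqrt


def weights_equal_samples(rho, occasions):
    """Weights a_1, ..., a_h for equal sample sizes (Exercise 63)."""
    a = [0.0]                                  # a_1 = 0 on the first occasion
    for _ in range(occasions - 1):
        a.append(rho / (2.0 - rho * a[-1]))
    return a


def variance_factors(rho, sizes):
    """V(M_h) / sigma^2 for sample sizes n_1, n_2, ... (Exercise 64)."""
    a_prev = 0.0
    factors = [1.0 / sizes[0]]                 # first occasion: plain sample mean
    for h in range(1, len(sizes)):
        a = sizes[h - 1] * rho / (sizes[h - 1] + sizes[h] - sizes[h] * a_prev * rho)
        factors.append((1.0 - rho * a) / sizes[h])
        a_prev = a
    return factors


if __name__ == "__main__":
    rho = 0.8
    a = weights_equal_samples(rho, 40)
    limit = (1.0 - sqrt(1.0 - rho ** 2)) / rho
    print(isclose(a[-1], limit, abs_tol=1e-12))
```

Calling variance_factors with sizes n_1 = g followed by n_h = g(1 - rho^2)^(1/2) returns values that all agree with 1/g up to rounding, which is the constant-variance design of Exercise 64.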

65. Consider the sampling procedure of Exercise 63. Suppose the mean
at occasion h is estimated after information on occasion (h + 1) relating
to occasion h is assembled. Show that this situation is identical with
Sec. 7.11, in which a sample of 2n units is selected every time, half of
which is the old sample and the other half selected afresh. Denoting by
$\hat{M}'_h$ the estimate of the mean on occasion h, we have $\hat{M}'_h = W_h \hat{M}_h + (1 - W_h)\bar{x}_h$,
where $\hat{M}_h$ is defined in Exercise 63. We know that $V(\hat{M}_h) = (1 - \rho a_h)\sigma^2/n$,
$V(\bar{x}_h) = \sigma^2/n$. The best values of $W_h$ and $1 - W_h$ would
be quantities proportional to inverses of the variances of $\hat{M}_h$ and $\bar{x}_h$.
Hence show that

$$\hat{M}'_h = (2 - \rho a_h)^{-1} \hat{M}_h + (1 - \rho a_h)(2 - \rho a_h)^{-1} \bar{x}_h$$

and

$$V(\hat{M}'_h) = \frac{\sigma^2}{n} (1 - \rho a_h)(2 - \rho a_h)^{-1}$$

66. From a universe of N units, 12 random samples are selected independently,
each of size n. One of the 12 samples is enumerated in the
first month of each calendar year, a second in the second month of each
calendar year and so on. During the enumeration, each member of the
sample reports data for the current month as well as for the previous
month. Denote by $y_h$ and $x_{h-1}$ the estimates of the population total for
occasions h and h - 1, respectively, based on information collected on
occasion h. Show that an unbiased estimate of the population total $Y_h$
on occasion h is given by


^ Vh K Xh _! + Kth-i = z h + Kth-i = K h ~'yi + £ 

i=2 


iv 


i 

all o = an d that the monthly correlations s 

K C t tI V( l= (1 - 2K > + K V and Cov fe, *-»,) = d 
12th or hifflf 1100118 ^ year * y correlations with yi and the terms involvi 
12th P-ers of K, show that for large h, we have V(W 

is (1 - (1 _ , p , further prove that the best value o 

‘ , t ./ ) 1/P and V ^ *en is given as .*(1 - 

sample contLs t unexprteXTaT edUre recommended for 

expectedly large units which will otherwise infl» te 




sampling variance. A 1 percent sample is selected from a universe every
month from January to December 1966 and information collected on y.
A suitable cutoff is set at $Y_0$ and all sample units above the
cutoff in all the months are assigned to one stratum (called the surprise
stratum). A new sample is selected for the month of January 1967 and
information collected on y. Any unit in this sample which is above the
cutoff is assigned to the surprise stratum. Information is collected on
all units in the surprise stratum and their weight is diminished by the
number of months involved (13 in this case). The procedure is repeated 
month after month. Using the fact that the sample units in the surprise 
stratum form a random sample from that stratum, show how this procedure 
may help in reducing the variance of the estimate for each month beginning 
January, 1967. How will you determine the point of cutoff? 

How will you modify this procedure if the technique is to be made 
operative right from the second month of the survey? 


NONSAMPLING ERRORS 

68. The measurement technique is such that observations $y_i$ are subject
to a constant bias of u, that is, $y_i = z_i + u$, where $z_i$ is the true value
of the unit. Show that the mean based on a wtr simple random sample
will be subject to a bias of u. What is the effect of this bias on the
variance of $\bar{y}$ and on the variance estimator? If measurements on an
auxiliary variate x are subject to a constant bias of v, discuss its effect
on the three estimators $\bar{y} + k(\bar{X} - \bar{x})$, $\bar{y} + b(\bar{X} - \bar{x})$ and $\bar{X}\bar{y}/\bar{x}$.
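The first part of Exercise 68 can be illustrated by enumerating all wtr simple random samples of a small population. The sketch below (function name and data hypothetical) returns the exact expectation of the sample mean and of the usual sample variance $s^2$; running it on the true values $z_i$ and on the biased values $z_i + u$ shows which of the two is affected by the constant bias.

```python
from fractions import Fraction
from itertools import combinations


def mean_and_s2_expectations(values, n):
    """Exact expectations of the sample mean and of s^2 over all
    without-replacement samples of size n."""
    samples = list(combinations(values, n))
    mean_total = Fraction(0)
    s2_total = Fraction(0)
    for s in samples:
        m = Fraction(sum(s), n)
        mean_total += m
        # Usual unbiased sample variance with divisor n - 1.
        s2_total += sum((Fraction(v) - m) ** 2 for v in s) / (n - 1)
    return mean_total / len(samples), s2_total / len(samples)
```

Comparing the output for z = [3, 7, 8, 12, 15] with the output for the same values shifted by u = 2 shows the mean moving by exactly u while the expected $s^2$ is unchanged. The effect on the three estimators that use the auxiliary variate is left to the exercise.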

69. The object is to estimate the true proportion P of units in a population
belonging to a class A, by drawing a wtr random sample of n units.
Due to errors of measurement some units get misclassified. Let $P_i$ be
the probability that the unit $U_i$ is classified as belonging to A. Let the
model be $y_{i\alpha} = M_i + e_{i\alpha}$ so that given $U_i$, $E_2(y_{i\alpha}) = M_i = P_i$, and
$V_2(y_{i\alpha}) = P_i Q_i$. Assume that the errors of measurement are uncorrelated from unit
to unit and that they average to zero over the whole population, that is,
$\sum P_i / N = P$. Then prove that the variance of the sample proportion p
would be given by


V( V ) = 


i -/s(Pi - py 

n N — 1 


+ ^Z P ‘ Q ‘ <1 nN^-l P< > 


70. In order to estimate the mean of a population a random sample of
n units is randomly and equally distributed among m enumerators chosen
out of an infinite supply. Assume that $y_{ij}$, the observation taken by


, on unit j can be expressed as ya = M,- + a, + e, ik where 
f = 0, rMu) = S. 2 *“d «<* is ^correlated with c,„,. Then 
obtain the expected value and the variance of the sample mean. Assuming
the cost of the survey to be represented by $C = c_1 n + c_2 m + c_3 (nm)^{1/2}$,
find the optimum number of enumerators for which the variance of the
sample mean will be a minimum for fixed cost. Discuss the special case
$c_1 = 0$.


71. A population containing N units is divided up into L groups with 
Nh units in the hth group. There are M h interviewers available to inter¬ 
view the units in group h. A random sample of n units is selected from 
the entire population giving rise to n h (a random variable) sample units 
from group h. Let n h be the number of units to be assigned to an inter¬ 
viewer in group h, there being m h (a random variable) interviewers 
required for the sample. Assume that the m h interviewers are selected 
at random from the $M_h$ and allocated at random to the sets of $n_h$ units
in the sample. Assume further that the response obtained
by an interviewer in group h is a random variable and the responses
obtained are independent if both the interviewer and respondent are
different. Then examme the estimator l$S W» for its bias and 

MSE. ' ' 


mg to the class < being 1 if tbT P°P ulatl0n belt 

otherwise, and j being 1 if thp •+ -n^ 1S ava ^ a ^^ e f° r interview ai 
same notation iphes to ImZ f “y 68 ” and 0 if “no.” ' 

over the variable. The“£ ^> a dot (•) denoting summai 

Proportion responding “ y !s”Tn the n P = <*» + *.>)/*, 

* un * t8 Produces n n available and P ° PU atlon ' A wr random samp 
7*7?. T Uable for interview Shn SP ^ dmg “ yes ” out of »*•> tb c ' 

bounder blas bei “« Wo./N)(N, /at tbat is a biased estim 

Further thlS bias ’ usin g the fact^th Obtain conserva 

S r , thafc the variance of \ *“/** Iies between 0 an 

Vl I ^ d ) > 1 — a is givei 

wb *e<.isthenormald “ 1 

exceeds 6 . al devi ate corre^ j. 

73. a po Du i + . 11 mg t0 the risk « that the errc 

ith sttaK 10n i8 SU PPo S ed tot,- 

Pil > Mil and of u nits that ^f ed Up into (w + 1) strata, 

’ * * ■ .*) rep r IL r ?f POnd at the ith call. 

e re lative weight, mean i 



variance of stratum i, the corresnnn^rw, ~ , 

being P mi ,M m2 and S m2 \ Questionnaire ^ Uantlt ! e ® for the last stratum 
of n x units leading to n n responses and n nT mai ed t0 a random sample 
* 2 n 12 units is taken and con^ A subsample of 

responses and n 22 nonresponses A qhK i » mE1 . attem P fc giving n n 
for the third ^ at rand ° m 

of wn , i,ni+o * : ] 1 Atthe(m + l)st step a random sample 

EimJny) - P F\ ken / . 1 mformatlon collected on it. Show that 
( U/ 1 ) “ Pn > E M(^ny)] = P 21 , and so on. Hence prove that an 

unbiased estimator of the population mean M = | M«P n + P m2 M m2 i s 
given by * 

1 r m 

— [Vx + k-r l y 2 + k^kr'y* +•••+([] ki )~ 1 y m + (wnfc,)- 1 t/ m+1 ’l 

t 

IS * hetot f of the characteristic y for the respondents at the jth 
attempt. Derive the variance of this estimator. 

74. From a population containing N persons a simple random sample of
n persons is selected and s random calls are made on each sample person.
Information on the variate y is collected from those persons who are
available at least once. Denoting by $p_i$ the probability that a specified
person will be found at home, the chance that he is available at least
once in s calls is $1 - q_i^s$ where $q_i = 1 - p_i$. Prove that the expected
value of the sample mean would be $\sum y_i (1 - q_i^s)/N$. Compare this and
the variance of the sample mean with the corresponding quantities
obtained when the plan of Politz and Simmons (Sec. 8.12) is followed.

75. There is a population of N bank accounts $A_j$ (j = 1, 2, . . . , N)
with $y_j$ being the true balance of $A_j$. A wr simple random sample of n
account holders is selected, and information regarding their balance is
collected from them. Let $x_j$ be the balance reported by the jth account
holder in the sample. The reported balance is matched against the one
shown in the bank records, and the difference is called the response error
$e_j$. Because of the presence of matching errors, however, $x_j$ may not
always be matched against $y_j$. Let $x_j$ be matched against $y_j$, with probability
p, and against $y_k$ ($k \ne j$), with probability q, so that p + (N - 1)q = 1.
Show that, under this model, the mean response error $\bar{e} = \sum e_j / n$
is an unbiased estimator of the true value $R = \bar{X} - \bar{Y}$ even though
matching errors are present.

Show further that 

Vfe) = + 2qZ(X, - X)(Y, - ?) 


Hence obtain the variance of $\bar{e}$ and show that an unbiased estimator of
$V(\bar{e})$ is provided by $\sum (e_i - \bar{e})^2 / [n(n - 1)]$.

76. In Exercise 75, given that the reported balance is $x_j$, the balance
against which it is matched is denoted by $z_j$ where

$z_j = y_j$ with probability p
$\quad = y_k$ ($k \ne j$) with probability q

and

p + (N - 1)q = 1

In random samples of n accounts selected with replacement, show that

$$E(\bar{z}) = \bar{Y} \qquad V(\bar{z}) = V(\bar{y})$$

$$\mathrm{Cov}(\bar{x},\bar{z}) = (p - q)\,\mathrm{Cov}(\bar{x},\bar{y})$$

$$\mathrm{Cov}(\bar{e},\bar{y}) = \mathrm{Cov}(\bar{x},\bar{y}) - (p - q)V(\bar{y})$$

$$\mathrm{Cov}(\bar{e},\bar{z}) = (p - q)\,\mathrm{Cov}(\bar{x},\bar{y}) - V(\bar{y})$$

How will you interpret these results?
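The matching model of Exercises 75 and 76 can be verified exactly for a single draw, from which the results for the sample means follow on dividing by n. The sketch below is an illustration with hypothetical names and data: a unit j is drawn with probability 1/N, and its reported balance $x_j$ is matched with $y_j$ with probability p or with any other $y_k$ with probability q = (1 - p)/(N - 1).

```python
from fractions import Fraction


def matching_moments(x, y, p):
    """Exact per-draw moments under the matching-error model.

    Returns (E z, V z, Cov(x, z), V y, Cov(x, y)) for one draw made with
    equal probabilities from the N accounts.
    """
    N = len(x)
    p = Fraction(p)
    q = (1 - p) / (N - 1)
    x = [Fraction(v) for v in x]
    y = [Fraction(v) for v in y]
    xbar, ybar = sum(x) / N, sum(y) / N
    ez = exz = ezz = Fraction(0)
    for j in range(N):
        for k in range(N):
            # Account j is drawn; its report is matched with record k.
            prob = Fraction(1, N) * (p if k == j else q)
            ez += prob * y[k]
            exz += prob * x[j] * y[k]
            ezz += prob * y[k] ** 2
    var_y = sum(v ** 2 for v in y) / N - ybar ** 2
    cov_xy = sum(a * b for a, b in zip(x, y)) / N - xbar * ybar
    return ez, ezz - ez ** 2, exz - xbar * ez, var_y, cov_xy
```

On any small data set the first two returned values agree with the mean and variance of y, and the covariance with z equals (p - q) times the covariance with y. The remaining two identities of Exercise 76 then follow because e = x - z.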

L 7 g ::;rrrt P of un its b ei 0 n g - 

an estimate of P,where z t P. = «*,,/» a, 

survey t' i s repeated on the same abll ! ty Pj and 0 otherwise. The 

conditions, and the estimate obtfT UmtS Under im P r °ved essential 

certain that, under 

»ay be used as fife - ^ 

7ft mu KlJt P*’)’ 

78. The object is to estimate P, the proportion of a population belonging
to the group A. A wr sample of n units is selected and each unit is
classified to group A or $\bar{A}$ (not A) on the basis of an observation which is
subject to error. The survey is repeated on the same units independently
with the following results.


Survey 1 


Survey 2 



« + c 


& +d 


survey eCt The laSSlfied and ^correctly^ci 11 ^! that a Unit belongin| 

p«, < 1 * a ~ X T es ^ nding PtobabiUtie s aS fo fied ' reSPeCtiVely at * 
^classified in 5‘ Thus the chance \h J * Un t t belonging to 

urve y 1 isq x =s p . ^ ^ unit selected at ranc 

11 Qq i2, and the corresponding ] 





bility in survey 2 is q, - Pq n + Q qa . Let { = ({l + w/2 . Show that 

A _ ® , & + C 

n 2n 

is a biased estimator of P and 

Bias(P) = q - (<;„ + tll )E(jf>) 

i 5 £(/>) - q 

Obtain upper and lower limits for the bias in P. 

79. It is desired to estimate the total area under wheat in a region containing
N parcels of land. A sample of n parcels is selected with replacement
with pp to x, area of the parcels. The area devoted to wheat in a
parcel is determined by eye-estimation. Let $x_{jt}$ be the observation on
the jth parcel at trial t. For a given parcel j, let $E(x_{jt}) = X_j$, $\sum X_j = X$.
Defining $d_{jt} = (x_{jt} - X_j)/p_j$ as the response deviation, $\delta_j = X_j/p_j - X$
as the sampling deviation, and $B = X - Y$ as the bias, show that the
mean square error of the estimate $t = n^{-1}\sum (x_{jt}/p_j)$ about Y is given by

$$V\left(\frac{\sum d_{jt}}{n}\right) + V\left(\frac{\sum \delta_j}{n}\right) + 2\,\mathrm{Cov}\left(\frac{\sum d_{jt}}{n}, \frac{\sum \delta_j}{n}\right) + B^2$$

= (response variance) + (sampling variance) + 2 (covariance
between sampling and response deviations) + (bias)$^2$


Give explicit expressions for these terms. 

80. A random sample of households is staggered uniformly over the year 
and information collected on the number of births occurring during the 
year preceding the date of enquiry (which may be taken as the first of 
each month) along with the month of occurrence of the birth. Since this 
procedure involves a recall period of one year, it is feared that the esti¬ 
mated number of births over the year will be subject to considerable 
response bias due to recall lapse. The following method is used for 
diminishing the effect of recall lapse. Let $x_{ik}$ be the number of births
reported in the ith month of the survey to have occurred in the kth
preceding month (k = 1, 2, . . . , 12). Assume that $x_{ik} = y + b_k + e_{ik}$,
where y is the true number of births, $b_k$ = recall bias which is supposed
to be a function of k, and $e_{ik}$ = random error with zero expectation and
variance $\sigma^2$. Then $\sum_i x_{ik}$ is the number of births based on the kth
preceding month, and $\bar y_k = \sum_{j=1}^{k} \sum_i x_{ij} / k$ will give the average monthly
number of births based on a recall period up to and including the kth month.
A curve of the form $y = y_0 \exp(-ak)$ is fitted to the points $(k, \bar y_k)$,
k = 1, . . . , 12, and the value of $y_0$ determined which gives the average
number of births per month with no recall lapse (k = 0). Comment on
this procedure of eliminating response bias, showing how the model
proposed is actually used. How will you proceed to attach a margin of error
to the adjusted sample estimate?
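
The curve-fitting step of Exercise 80 can be sketched as follows. This is an illustration only: it assumes the curve has the form y = y_0 exp(-a k), and it fits the curve by ordinary least squares on the logarithm of the averages, which is one convenient choice and not necessarily the fitting method intended in the exercise. The function name `fit_recall_curve` and the data are hypothetical.

```python
import math

def fit_recall_curve(points):
    """Least-squares fit of log(y) = log(y0) - a * k.

    points: list of (k, average) pairs with positive averages.
    Returns (y0, a), where y0 is the fitted value at k = 0.
    """
    ks = [k for k, _ in points]
    logs = [math.log(v) for _, v in points]
    m = len(points)
    k_bar = sum(ks) / m
    log_bar = sum(logs) / m
    sxx = sum((k - k_bar) ** 2 for k in ks)
    sxy = sum((k - k_bar) * (l - log_bar) for k, l in zip(ks, logs))
    slope = sxy / sxx
    intercept = log_bar - slope * k_bar
    return math.exp(intercept), -slope

# Hypothetical averages that decline with the length of the recall period.
points = [(k, 50.0 * math.exp(-0.03 * k)) for k in range(1, 13)]
y0, a = fit_recall_curve(points)
print(round(y0, 6), round(a, 6))
```

The fitted `y0` is the extrapolated monthly average at zero recall length; its margin of error would have to account for the correlation between the cumulative averages, which this sketch ignores.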


MISCELLANEOUS 


81. From a random sample of n units a random subsample of $n_1$ units
is taken and added to the original sample. Show that the mean based
on the $(n + n_1)$ observations is an unbiased estimate of the population
mean but its variance is greater than the variance of the mean based on
the original n units by the approximate factor


( 


1 + 3- 
n 



82. Let t be an unbiased estimator of $\theta$ with $V(t) = k\theta^2$, k being known.
Let the loss function be $\lambda(\theta)(t - \theta)^2$ where $\lambda(\theta) > 0$. Prove then that
the risk associated with t will be greater than that of $t/(k + 1)$. Use
this result to show that $\sum (y_i - \bar y)^2/(n + 1)$ is a better estimator of the
variance of the normal population than $\sum (y_i - \bar y)^2/(n - 1)$.
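
The comparison in Exercise 82 can be checked numerically. The sketch below assumes normal data, for which the sum of squares about the mean has expectation (n - 1)σ² and variance 2(n - 1)σ⁴; the function name is hypothetical and σ² = 1 is an arbitrary choice.

```python
def mse_of_scaled_sum_of_squares(c, n, sigma2=1.0):
    """MSE of c * sum((y_i - ybar)^2) as an estimator of sigma2 (normal data)."""
    mean = c * (n - 1) * sigma2                   # E[SS] = (n - 1) sigma^2
    var = c ** 2 * 2 * (n - 1) * sigma2 ** 2      # V[SS] = 2 (n - 1) sigma^4
    return var + (mean - sigma2) ** 2

for n in (5, 10, 30):
    usual = mse_of_scaled_sum_of_squares(1 / (n - 1), n)
    shrunk = mse_of_scaled_sum_of_squares(1 / (n + 1), n)
    print(n, usual, shrunk)
```

Here k = 2/(n - 1), so dividing the unbiased estimator by k + 1 turns the divisor n - 1 into n + 1, and the mean square error falls from 2σ⁴/(n - 1) to 2σ⁴/(n + 1).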

83. Let $v^2$ be an unbiased estimator of $S_y^2$. Then $E(v - S_y)^2 =
2S_y[S_y - E(v)] \ge 0$. Hence prove that $E(v) \le S_y$.

84. Let x and y be two random variables. Let $\Delta x = x - X$, $\Delta y = y - Y$,
$\delta x = \Delta x/X$, $\delta y = \Delta y/Y$, $D_{ij} = E[(\delta x)^i (\delta y)^j]$, and $E_{ij} = E[(\Delta x)^i (\Delta y)^j]$.
Show that $xy - E(xy) = XY(\delta x + \delta y + \delta x\,\delta y - D_{11})$. Hence prove that

    $V(xy) = X^2 V(y) + Y^2 V(x) + 2XY E_{11}
             + 2X E_{12} + 2Y E_{21} + E_{22} - E_{11}^2$

Based on a random sample of n observations $(x_i, y_i)$, show that
$w = (n \bar x \bar y - \sum x_i y_i / n)/(n - 1)$ is an unbiased estimator of XY. Obtain the
variance of w.
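
The variance formula of Exercise 84 can be verified exactly on a small discrete distribution. In the sketch below, X and Y are the expected values of x and y, `E_ij` is the mixed central moment of order (i, j), and the joint distribution is hypothetical; the function name is not from the text.

```python
from fractions import Fraction

def product_variance_check(dist):
    """dist maps (x, y) to its probability.

    Returns V(xy) computed directly and V(xy) from the moment formula.
    """
    def expect(f):
        return sum(p * f(x, y) for (x, y), p in dist.items())

    X = expect(lambda x, y: x)
    Y = expect(lambda x, y: y)

    def E(i, j):
        # Mixed central moment E[(x - X)^i (y - Y)^j].
        return expect(lambda x, y: (x - X) ** i * (y - Y) ** j)

    direct = expect(lambda x, y: (x * y) ** 2) - expect(lambda x, y: x * y) ** 2
    formula = (X ** 2 * E(0, 2) + Y ** 2 * E(2, 0) + 2 * X * Y * E(1, 1)
               + 2 * X * E(1, 2) + 2 * Y * E(2, 1) + E(2, 2) - E(1, 1) ** 2)
    return direct, formula

# Hypothetical joint distribution of two dependent variables.
dist = {(1, 2): Fraction(1, 4), (2, 1): Fraction(1, 4), (3, 5): Fraction(1, 2)}
direct, formula = product_variance_check(dist)
print(direct == formula)
```

No independence assumption is needed; the identity is exact for any joint distribution with finite fourth moments.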


85. We shall define a sampling scheme as a procedure of selecting units
one by one from the population with predetermined sets of probabilities
for the units at each of the selections. By a sample design D we shall
mean an arbitrary collection of samples s with a probability p(s) for
each sample, $\sum p(s) = 1$. It is clear that any sampling scheme will
result in a sample design. By introducing the null unit in the population
(the occurrence of which at a draw means the absence of
selection of any unit then) and by augmenting samples by the null unit
suitably, prove that for any given sample design D we can find a sampling
scheme uniquely which results in D.

86. Suppose the sample estimator Sa/ti- is imKi Jr 

total Y. Instead of the n weights a, we'wish ^ d for the P°P“lation 

smaller than n) rounded-off multipliers b <h l** IUSt * <considerabl y 

M(Sny<) = So^i and 7(5^) is a minimum whip ,' ' < _f‘ such that 

taking the values b h b h . . . b k w\iho^ ■ ere r * a random variable 
of the rounded-off multipliers includes thela"* 1 ** 1 ‘I 1 *' M the range 

Xri) = * and 7(r<) is a mimmul if r t akl th?l? T *° W that 

i~ - » - <« - W(W t *» - -V 

87. There are two long lists containing N and M names with D names
common to them. Samples of sizes n and m are selected at random
without replacement from the two lists and d names are observed to
be common to the two samples. Show that an unbiased estimate of D
is given by $\hat D = NMd/(nm)$, the variance of $\hat D$ being

    $V(\hat D) = NMD(nm)^{-1}[1 + (n - 1)(m - 1)(D - 1)(N - 1)^{-1}(M - 1)^{-1}] - D^2$

Show further that an unbiased estimator of $V(\hat D)$ may be taken as

1 nm(N — 1) (Jfcf — 1 )"| 

L(n-l)(m-l) l \ + D L 1 ~ fT - ,i )(m _ l)N j f\ 

Given the cost function $C = c_0(n + m) + c_1 nm$, show that the values of
n and m which minimize $V(\hat D)$ for a specified C are given by n = m if
$D < 1 + c_0(N - 1)(M - 1)(C - c_0)^{-1}$.
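
For small lists the estimator of Exercise 87 can be checked by enumerating every pair of samples. The sketch below is an illustration with hypothetical list sizes; the names in the first list are coded 0, ..., N - 1 and the first D of them are taken to be the names shared with the second list. The function names are not from the text.

```python
from fractions import Fraction
from itertools import combinations

def list_overlap_moments(N, M, D, n, m):
    """Exact mean and variance of D_hat = N*M*d/(n*m) over all sample pairs."""
    list_a = list(range(N))
    # The second list shares names 0..D-1 and has M - D names of its own.
    list_b = list(range(D)) + list(range(N, N + M - D))
    values = []
    for sample_a in combinations(list_a, n):
        for sample_b in combinations(list_b, m):
            d = len(set(sample_a) & set(sample_b))
            values.append(Fraction(N * M * d, n * m))
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def variance_formula(N, M, D, n, m):
    bracket = 1 + Fraction((n - 1) * (m - 1) * (D - 1), (N - 1) * (M - 1))
    return Fraction(N * M * D, n * m) * bracket - D ** 2

mean, var = list_overlap_moments(N=6, M=5, D=3, n=3, m=2)
print(mean, var, variance_formula(6, 5, 3, 3, 2))
```

Complete enumeration is feasible only for toy sizes, but it confirms both the unbiasedness of the estimator and the variance expression without any simulation error.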

88. There are two lists of names merged into one with R names. A wtr
random sample of r names is selected and the names in the sample are
classified as those belonging to the first list or to the second. Let d be
the observed number of names common to the two lists. Define the
random variable $\delta_j$ (j = 1, . . . , R - D) associated with the jth distinct
name as follows: $\delta_j = 1$ if the name occurs in both samples and 0 otherwise.
Show that $d = \sum \delta_j$ and $E(d) = D\,r(r - 1)/[R(R - 1)]$. Hence show that
an unbiased estimate of D is provided by $\hat D = R(R - 1)d/[r(r - 1)]$, the
variance of $\hat D$ being

Consider the special case when the two lists are of the same size W. A
wtr sample of size u is selected from either list when the lists are separated
and let $V_1(\hat D)$ be the variance of the estimator. In the other situation
the lists are merged into one, and a random sample of 2u names is taken
to find out the number of names common to the lists. Let $V_2(\hat D)$ be the
variance of the estimator in this case. Show that $V_1(\hat D) - V_2(\hat D) < 0$
provided that $D - 1 < c_0(W - 1)^2(C - c_0)^{-1}$.

89. There are two frames A and B, with a known degree of coverage
and overlap, available for a population, it being assumed that every unit
in the population belongs to at least one of the frames. Thus $N_a$ units
belong to A only (domain a), $N_b$ to B only (domain b) and $N_{ab}$ to both A
and B (domain ab). The unit $U_i$ possessing the value $y_i$ will be given the
value $py_i$ if it is in A and belongs to ab and given the value $(1 - p)y_i$

prl'll B “) d y be i 0 T V 6 -, Thus the ; * 

are selected from “the two fraiTmd’tht^pXtion t°t T ^ U “ its 

* - *•«■ + W, + *2 + C P " total 18 estimat «* *>y 

Show that the variance of t is given approximately by 

f ( ?) = -i- b , (1 - o) + P wi + mi _ w + 9W) 

cost functicn‘c J I X ’ c l“ + "fn *’^ bei ” g ignored - Assuming a Unear 
and n B such that for a give/cos’t eXpressions f “ r ‘be values of p, n A 
f or the special case in wh“; fr^Tn * ° f ? * made a “hrimum. 
that N.t = N s ,p = I S i 1 haS 100 P ercent coverage so 

’ ‘ S “ ■ Sh0w tha * the best value of p is given by 

p* = 1 - Nb/Na 

90 Thecoeffi . Ca/cb ~ Nb/Na 

l and A ls Proposed to ustthis a Population is known to be 

Show that in wr simple rlnd 0 m matin S the population mean 

enor of **, + .sampling, P = nvy has a mean square 

en W V (' n + k 2 ). And for thisVT^ 6 is a minimum 

«• A sample of two^t^ = ^ + ^ < V<S) 

at each draw. 8 * ith un «qua] probabiUties without 

Tbe r 18 : Sc * w here c, the Potion total Y, the esti- 

rth draw 8 P p ablllty of selecting the^tlf^ 1 '?' 1 *° the rth draw - Let 

Yep- - i f r ° Ve tbat the condition t! b Un t in the population at the 
it”' 1 f0r •"** i. Hence show rtf* the ^imator be unbiased is 
an unhi 88 a 6 ™ 1 sai “pling system o tbM “° unbl ased estimator exists 
S-t eletS:ir tor if ‘boX, PrOV l furth - that there does exist 
probabilities he sam P le with probah*iv beme consists in selecting the 

“abilities Vi and the second with equal 



92. A population consists of five units with selection probabilities $p_i$,
proportional to $x_i$, as given below:

    Unit    $U_1$    $U_2$    $U_3$    $U_4$    $U_5$
    $p_i$   0.33     0.19     0.18     0.16     0.14


The problem is to estimate the population total Y by selecting a sample
of two units so that the selection probabilities $\pi_i = 2p_i$ are preserved.

Further, the probability of selecting each of the nonpreferred samples
(*,«!), (*.,*.), («.,«.)» a nd is to be made as small as 0.02. 

What would be the probabilities of selecting the other five preferred 
samples? 

93. A sample of n different units is to be selected from a population of
N units. It is considered desirable that the probability of the selection

0 f m — — (^ 2 ^j uonpreferred samples be made as small as a/m each. 

The sampling scheme should be such that each unit in the population has
a probability $\pi_i = n/N$ of being selected in the sample. To effect this
solution, $\pi_{ij}$ is set equal to $n(n - 1)/[N(N - 1)]$. This will give $\binom{N}{2}$
linear equations in the $P_t$, where $P_t$ is the probability of selecting the tth
sample. Obtain the conditions under which this system of equations will
produce a meaningful solution. Assuming the conditions to hold, how
will you estimate the population total and the variance of the estimate?


94. A survey is planned to estimate the proportion of a population 
belonging to the group C. It is, however, feared that respondents will 
not cooperate in answering the question involved, which is very personal. 
Thus the following procedure, in which the respondent furnishes informa¬ 
tion on a probability basis only, is adopted. The respondent is given a 
spinner with a face marked so that the spinner points to the letter C with 
probability p (known) and to $\bar C$ (not C) with probability q = 1 - p.
The respondent is required to spin the spinner unobserved by the inter¬ 
viewer and report only whether or not the spinner points to the letter 

representing the group to which he belongs.

A wr simple random sample of n respondents is selected to estimate P.
Let $x_j$ be the response obtained from the jth individual where $x_j$ takes on
the values 0 (no) and 1 (yes) only. Then

    $\Pr(x_j = 1) = Pp + (1 - P)(1 - p)$

    $\Pr(x_j = 0) = P(1 - p) + (1 - P)p$

Show that

    $E(x_j) = P(p - q) + q$

Hence show that $\hat P = (\sum x_j/n - q)/(p - q)$ is unbiased for estimating P
and

    $V(\hat P) = \frac{1}{n}\left[\frac{1}{16(p - \frac{1}{2})^2} - (P - \frac{1}{2})^2\right]$
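
The randomized-response estimator of Exercise 94 can be examined exactly, since the number of "yes" answers in a wr simple random sample is binomial. The sketch below is an illustration under that assumption; the function names and the values of P, p, and n are hypothetical.

```python
from fractions import Fraction
from math import comb

def warner_exact_moments(P, p, n):
    """Exact mean and variance of P_hat = (xbar - q)/(p - q)."""
    q = 1 - p
    theta = P * p + (1 - P) * q            # Pr(x_j = 1)
    mean = Fraction(0)
    second = Fraction(0)
    for k in range(n + 1):                 # k = number of "yes" responses
        prob = comb(n, k) * theta ** k * (1 - theta) ** (n - k)
        estimate = (Fraction(k, n) - q) / (p - q)
        mean += prob * estimate
        second += prob * estimate ** 2
    return mean, second - mean ** 2

def warner_variance_formula(P, p, n):
    half = Fraction(1, 2)
    return (1 / (16 * (p - half) ** 2) - (P - half) ** 2) / n

P, p, n = Fraction(3, 10), Fraction(7, 10), 5
print(warner_exact_moments(P, p, n), warner_variance_formula(P, p, n))
```

The variance grows without bound as p approaches 1/2, which is the price paid for the respondent's protection: a spinner that is nearly fair reveals almost nothing about group membership.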

,“ d “ J 3*— »<*»«*.«»«. 

(*t of all p0S8ible sam ple 3 q g n fa te d bv U ? nCe ' th<S sam Plo *Pa”e 

partitioned into subsets containing, • b , y samplm S scheme can be 
oe called sufficient. By defining a stltilticT Sa . m P les ’ the Partition will 
sample space into subsets each contai n! 1““ a partition of «>o 

zxl" “• ssz 

* ^4S2S£r»-~ * 

in as ° f i. he UIUt V ‘- The different mfit*r.’ V * where i is the 

s cTu V ng ° rder of their indices an 7 t f “ * ? mple of » are arranged 
t>. Let ,k 0rder s *atistic. Show that th ** ; • •) obtained 

on a sn < be an ’“i’iased estimator of th* ° rder stat istio is sufficient. 
Show tw a ° f 8ize "• ^et t ' = £/,| T ? f population total Y based 
hat l is unbiased for Y and ’ Where T ls& suffici ent statistic. 

nt) = F(o _ m _ 0 , 

so ^*at t' has a smaller w • 

Pi (i' S = P f ° 2 Se * popul ation is sampled^ 18 ^ Ra °- Bla okwell theorem.) 
Then, L’ 7 : • ’ N >’*Pi = 1 u‘ t n t r j epIaCemeatwitl ‘Probabilities 
selected fo^ P ““biased estimator of y*'“ Ct ' Iaitsllav e been selected. 
Then the nr 7 ^ the two distinct unit t " here 7.P) relate to the unit 

selection = p “/ff 7 that the sample s cont^'" 8 * 116 Sample S be *’ and, '‘ 
first = ~ Pi) and, similarlv «, amS * andj and i is the first 

Thus the' condv Bence PrU) L e Probability that j is selected 
sample s ) Probability ^(l ~ ft) + 1/(1 - *» 

statistic T ck (2 — pi — v \ n selection is i (given the 
’ Show ‘hat: P ‘>- Considering s ( bi ) as the sufficient 


has a smaller variance than ( and that 


and an unbiased estimator of V(t ') is provided fay ’ 

?«') = (yi _ - p ,)g - p, _ P() 

Vp< vj —- 


< 


I (* _ »V .. 

J. (1 -p'-p/) 


* 

d. A sample of n units is selected with replacement with equal
probabilities from the population $U_1, U_2, \ldots, U_N$. The distinct units in
the sample are arranged in increasing order of their indices and their
value for y ascertained. Show that the order statistic $T = (y_{(1)}, \ldots)$ is
sufficient and that $E(\bar y \mid T) = \sum y_{(i)}/\nu$ where $\bar y = \sum y_i/n$ and $\nu$ is the number
of distinct units. Hence use the Rao-Blackwell theorem to prove that
the estimator based on distinct units is superior to the one based on all
units in the sample.
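
The claim in part d can be illustrated by enumerating all equally likely with-replacement samples from a tiny population. The sketch below uses hypothetical y values and a hypothetical function name; it compares the mean over all draws with the mean over distinct units.

```python
from fractions import Fraction
from itertools import product

def compare_means(y, n):
    """Exact (mean, variance) of the two estimators over all N**n samples."""
    N = len(y)
    mean_all_draws = []
    mean_distinct = []
    for draw in product(range(N), repeat=n):
        mean_all_draws.append(Fraction(sum(y[i] for i in draw), n))
        distinct = set(draw)               # repeated units counted once
        mean_distinct.append(Fraction(sum(y[i] for i in distinct), len(distinct)))

    def moments(values):
        mu = sum(values) / len(values)
        return mu, sum((v - mu) ** 2 for v in values) / len(values)

    return moments(mean_all_draws), moments(mean_distinct)

(all_mean, all_var), (dist_mean, dist_var) = compare_means([3, 7, 8, 14], n=3)
print(all_mean, dist_mean, all_var, dist_var)
```

Both estimators have the population mean as their expectation; the one based on distinct units has the smaller variance, as the Rao-Blackwell argument predicts.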


96. a. There is an incomplete list available of the units forming a target
population and a sample is selected from it. This sample is supplemented
by an area sample to take in units not on the list. Let $N = N_1 + N_2 + N_3$ where

    $N_1$ = number of units accessible only through list sampling

    $N_2$ = number of units accessible only through area sampling

    $N_3$ = number of units accessible through both list and area sampling

Let $Y = Y_1 + Y_2 + Y_3$. Based on the list sample, we have $\hat Y_L =
\hat Y_{L1} + \hat Y_{L3}$ as an unbiased estimate of $Y_1 + Y_3$. Similarly, we have
$\hat Y_A = \hat Y_{A2} + \hat Y_{A3}$ as an unbiased estimate of $Y_2 + Y_3$ from the area
sample. Show that

    $\hat Y_{L1} + \hat Y_{A2} + \lambda \hat Y_{L3} + (1 - \lambda)\hat Y_{A3}$,   $0 \le \lambda \le 1$

is an unbiased estimate of Y. How will you obtain the best value of $\lambda$?
Interpret the case when $\lambda = 1$.

b. The various units in a population are considered to be ordered
geographically so that it is possible to determine unambiguously the successor
of each unit. There is a list (possibly incomplete) of the units in the
population. A simple random sample of units is selected from the list
and the following method is used to take in units not on the list. For
each unit in the sample, determine its successor to see if it is on the list.
If the successor is on the list, discard it. If it is not on the list, include
it in the sample; then identify its successor and proceed in the same way
until a successor is found to be on the list. How will you analyze the
data thus obtained?

97. There is a population containing three units $u_1$, $u_2$, and $u_3$. The
object is to estimate the population total $Y = y_1 + y_2 + y_3$ by selecting
a sample of two units. Let the sampling and estimation procedures be as
follows:

    Sample (s)      Pr(s)    Linear estimator t    Quadratic estimator t'
    $(u_1, u_2)$    1/2      $y_1 + 2y_2$          $y_1 + 2y_2 + y_1^2$
    $(u_1, u_3)$    1/2      $y_1 + 2y_3$          $y_1 + 2y_3 - y_1^2$

Prove that the admissible estimator t and the quadratic estimator t' are
unbiased and that

    $V(t) = \frac{1}{2}[(y_1 + 2y_2)^2 + (y_1 + 2y_3)^2] - Y^2$

    $V(t') = V(t) + y_1^2(y_1^2 + 2y_2 - 2y_3)$
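
A direct computation over the two possible samples shows how the quadratic estimator of Exercise 97 can have the smaller variance. The sketch below is illustrative only; the y values and the function name are hypothetical, and each of the two samples is taken to have probability 1/2.

```python
from fractions import Fraction

def exercise_97_moments(y1, y2, y3):
    """Mean and variance of t and t' over the two equally likely samples."""
    half = Fraction(1, 2)
    total = y1 + y2 + y3
    t = [y1 + 2 * y2, y1 + 2 * y3]               # linear estimator
    t_quad = [t[0] + y1 ** 2, t[1] - y1 ** 2]    # quadratic estimator

    def mean(values):
        return half * (values[0] + values[1])

    def var(values):
        return half * (values[0] ** 2 + values[1] ** 2) - mean(values) ** 2

    return total, (mean(t), var(t)), (mean(t_quad), var(t_quad))

total, (mean_t, var_t), (mean_q, var_q) = exercise_97_moments(1, 2, 5)
print(total, mean_t, var_t, mean_q, var_q)
```

The difference in variance is $y_1^2(y_1^2 + 2y_2 - 2y_3)$, so the quadratic estimator is the better one whenever $y_1^2 + 2y_2 - 2y_3 < 0$.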

"" r Tt e 


with 


where , is the ■ 1 ~ 1 '"-° 

selected and ^„ Tananc '> of the N - ,• . 

the cutoff point X’Z appl y to M(( from which the 
to take the sample wifi, ' e as60ci ated n Jm i, he problem is to < 
p W with i = m P i e painty. ^ ot «nits „ beyc 

show that and wtthi - m + ^ necessary condi 

Uld exce ed V(fi) wi 





and an upper limit for X' is given by 




X' <V +* 


* n 


where $\mu$ and $\sigma$ are respectively the mean and the standard deviation of the
population. Suppose the rule is followed that as many units as can be
identified to exceed $\mu + k\sigma$ are included in the sample with certainty.
What would be the value of k?

99. It is desired to estimate the mean consumption of cigarettes in a 
certain area for which lists of households and of individuals are available. 
The following procedure is adopted in order to give the subgroup of heavy 
smokers increased numerical representation in the sample (which a random 
sample of individuals will not do). Select a random sample of households. 
In each selected household select all individuals who are heavy smokers 
(e.g., smoke more than 20 cigarettes per day) and at random one individual
who is not a heavy smoker. Thus the procedure involves two-stage
sampling with stratification at the second stage and 100 percent sampling of
subgroup members.

How will you calculate an unbiased estimate of the population mean
and its variance? Under what conditions is this procedure better than
taking a simple random sample of individuals?

100. There is a finite area S in which m points are located at random.
We can trace a continuous path among the m points by starting at some
point and connecting the points by line segments. The points can be
connected in any order giving rise to m! possible paths along with their
associated distances. Denoting by L the length of the shortest path,
show that a lower bound for the expected value of L is given by

1 r~r wi — 1 

E(L) > - VA —7=- 

* Vra 

where A is the measure of the area S in which the m random points are
located.

101. A bag contains N balls of various colors, but the number of colors
is not known. The object is to estimate the number of colors K present
in the bag by taking a wtr simple random sample of n balls.
Let $x_i$ be the number of colors with i balls in the sample. Assuming
that n is not less than the maximum number of balls of any color in the
bag, show that

    $E(x_i) = \sum_j \frac{\binom{j}{i}\binom{N - j}{n - i}}{\binom{N}{n}} K_j$

where $K_j$ is the number of colors with j balls in the bag. Hence show
that an unbiased estimate of K is given by

    $\hat K = \sum_{i=1}^{n} A_i x_i$

where

    $A_i = 1 - (-1)^i \frac{(N - n + i - 1)^{(i)}}{n^{(i)}}$

and

    $a^{(i)} = a(a - 1)(a - 2) \cdots (a - i + 1)$,   $a^{(0)} = 1$
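
The unbiasedness claimed in Exercise 101 can be confirmed for a small bag by enumerating every wtr sample. The sketch below is an illustration; the bag contents and the function names are hypothetical, and `falling(a, i)` stands for the factorial power $a^{(i)}$ defined in the exercise.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def falling(a, i):
    """a^(i) = a (a - 1) ... (a - i + 1), with a^(0) = 1."""
    out = 1
    for r in range(i):
        out *= a - r
    return out

def color_count_estimate(sample_colors, N):
    """K_hat = sum of A_i * x_i for one sample of colors drawn from N balls."""
    n = len(sample_colors)
    balls_per_color = Counter(sample_colors)
    x = Counter(balls_per_color.values())   # x[i] = colors seen exactly i times
    return sum(
        (1 - (-1) ** i * Fraction(falling(N - n + i - 1, i), falling(n, i))) * x_i
        for i, x_i in x.items()
    )

def expected_estimate(bag, n):
    """Average of K_hat over all equally likely wtr samples of size n."""
    samples = list(combinations(range(len(bag)), n))
    total = sum(color_count_estimate([bag[b] for b in s], len(bag)) for s in samples)
    return total / len(samples)

bag = ["red", "red", "red", "green", "green", "blue"]   # K = 3 colors
print(expected_estimate(bag, n=3))
```

Individual estimates can be far from K, and even negative or larger than the sample size; unbiasedness holds only on average, and only when n is at least the largest number of balls of one color.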


Give the status of the estimators: 


(«) 

(c) J f = N - ~ jj 

/ ^ . 


JV v 


(&) - > 
n 4* 


= Si* 


n(n - 1 ) = a ^ > So;,- 

if a' < Us* 


in the folIowingmanner^The 1 first unH^ * Stratum ntN “nits 
P< based on measures of size x t U = , o ‘ 3 seI ?‘; ted with Probabilities 

IS selected with probabilities ’ ’ ' ‘ > N )- The second unit u. 


where 
Show that 


JV 



TTU 


i + Anr^ + rri _) 


Hence prove that th 

selected * , 

? ssat s-aw.-s**:« -*. 
’«^ :r • s ~ x 

Proposed to a CC enf P tl<m of AT Vouc h ft the two sch emes. 

vouchers is P, nr 1 *. 



and that of accepting the population should be less than $\beta$ if the
proportion of incorrect vouchers is $P_2$ or higher. Assuming that the number
of incorrect vouchers, d, in random samples of n, follows (a) the normal
distribution, (b) the Poisson distribution, determine the sample size n and
the acceptance number c.


104. The purpose is to devise a single sampling inspection plan for a
product being received in lots of N items. The expected quality is
known to be P in terms of the proportion of defective items. The plan
should ensure a specified value of the average outgoing quality limit
(AOQL) at minimum cost of inspection. Assuming that the number of
defective items d in random samples of n items follows the Poisson
distribution, show that the maximum proportion defective in the outgoing
quality after inspection occurs for $p = p_1$ when



The cost of inspection will be given by

    $C(P) = N - (N - n) \sum_{d=0}^{c} \frac{e^{-nP}(nP)^d}{d!}$

Find the sample size n and the acceptance number c which minimize the
cost of inspection for a specified value of the AOQL.

105. Show that the variance of the product of two random variables X
and Y is given approximately as

    $V(XY) = (\bar X \bar Y)^2 \left[\frac{V(X)}{\bar X^2} + \frac{V(Y)}{\bar Y^2} + 2\,\frac{\mathrm{Cov}(X, Y)}{\bar X \bar Y}\right]$

Use this result to find the variance of the product estimator $\bar x \bar y/\bar X$
calculated from a random sample of size n for estimating the population
mean $\bar Y$. Show that the product estimator has a smaller variance than
the sample mean $\bar y$ when $\rho < -(1/2)\,CV(X)/CV(Y)$.
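
The condition at the end of Exercise 105 can be illustrated numerically. The sketch below applies the large-sample approximation to the sample means in wr simple random sampling, so that each relative variance and the relative covariance are divided by n; the function names and the numerical values are hypothetical.

```python
def approx_var_product_estimator(y_mean, cv_x, cv_y, rho, n):
    """Approximate variance of xbar * ybar / Xbar (unit-level CVs, wr sampling)."""
    return y_mean ** 2 * (cv_x ** 2 + cv_y ** 2 + 2 * rho * cv_x * cv_y) / n

def var_sample_mean(y_mean, cv_y, n):
    """Variance of the sample mean ybar in wr simple random sampling."""
    return y_mean ** 2 * cv_y ** 2 / n

cv_x, cv_y = 0.4, 0.5
threshold = -0.5 * cv_x / cv_y            # rho must fall below this value
for rho in (-0.6, -0.2):
    product = approx_var_product_estimator(10.0, cv_x, cv_y, rho, n=25)
    mean = var_sample_mean(10.0, cv_y, n=25)
    print(rho, rho < threshold, product < mean)
```

The two variances differ by the term $CV(X)^2 + 2\rho\,CV(X)\,CV(Y)$, which is negative exactly when $\rho$ falls below the stated threshold.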


106. It is intended to estimate the population mean $\bar Y$ by taking a wtr
random sample. It is found that auxiliary information on p + q variates



*+X, Sjh-i, 


x i s already available but the variates 
”, C are negatively correlated with y. Show that the 


I 









bias in the estimator 


is given by 


f X, , ”v' * - 

< i = l Wi 7r y + l Wi T< y 

1 Xx p+1 * 


'LWi = 1 


Cov (y,Xi) 


Find a large sample approximation to the variance of fi. 

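107. There is a sampling scheme which permits unbiased estimation of
Y and X. Denoting the estimators by $\hat Y$ and $\hat X$ respectively, show that
the bias of the ratio estimator $\hat R = \hat Y/\hat X$ satisfies the relation

    $\frac{|B(\hat R)|}{\sigma(\hat R)} \le CV(\hat X)$

Further, an approximate expression for $B(\hat R)$ is given by

    $B(\hat R) = R\,CV(\hat X)[CV(\hat X) - \rho(\hat Y, \hat X)\,CV(\hat Y)]$

and

    $E(\hat R - R)^2 \approx \frac{1}{X^2}[V(\hat Y) + R^2 V(\hat X) - 2R\,\mathrm{Cov}(\hat Y, \hat X)]$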
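
The bound on the relative bias in Exercise 107 can be checked by complete enumeration. The sketch below takes equal-probability sampling without replacement as one example of a scheme that estimates both totals unbiasedly; the population values and the function name are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev

def ratio_bias_check(x, y, n):
    """Return (|bias| / sd of R_hat, CV of xbar) over all samples of size n."""
    samples = list(combinations(range(len(x)), n))
    r_hat = [sum(y[i] for i in s) / sum(x[i] for i in s) for s in samples]
    x_bar = [sum(x[i] for i in s) / n for s in samples]
    R = sum(y) / sum(x)                     # population ratio
    bias = mean(r_hat) - R
    return abs(bias) / pstdev(r_hat), pstdev(x_bar) / mean(x_bar)

relative_bias, cv_x = ratio_bias_check([2, 3, 5, 8, 11], [3, 5, 8, 12, 20], n=2)
print(relative_bias, cv_x, relative_bias <= cv_x)
```

Because all samples are equally likely here, the population standard deviation over the enumerated samples is the exact sampling standard deviation, so the inequality is tested without approximation.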


REFERENCES TO EXERCISES
WITH REMARKS



JOURNAL ABBREVIATIONS 

AISM: Annals of the Institute of Statistical Mathematics 
AMS: Annals of Mathematical Statistics 
BISI: Bulletin of the International Statistical Institute
JASA: Journal of the American Statistical Association 
JISAS: Journal of the Indian Society of Agricultural Statistics 
JRSS: Journal of the Royal Statistical Society 
RISI: Review of the International Statistical Institute 


1. Raj, D. and S. H. Khamis (1958), AMS, 29. Basu, D. (1958), Sankhya, 20. 
This shows that it is better to exclude repetitions than to include them. 

2. It always pays to possess information on some units in the population.

3. Roy, J. and I. M. Chakravarti (1960), AMS, 31. It is shown that an
estimator with variance proportional to the population variance is linearly
invariant and that there exists a lower bound for its variance.

4. Raj, D. (1954), Ganita, 5. The condition for pps sampling to be superior to
equal probability sampling is given.

5. Raj, D. (1958), JASA, 53. Another condition for pps sampling to produce 
lower variance than equal probability sampling is stated. 

6. Durbin, J. (1953), JRSS, B15. Gives the bias of the variance estimator 
appropriate to with-replacement pps sampling when sampling is wtr with unequal 
probabilities. 

7. Sen, A. R. (1955), JISAS, 5. Raj, D. (1956), JASA, 51. Gives a situation
in which Yates and Grundy’s variance estimator is positive.

8. Raj, D. (1956), Sankhya, 17. Here is a method of selecting a wtr sample of
two units from a stratum when there is linearity between y and x.

9. a. Sen, A. R. (1955), JISAS, 5. Raj, D. (1956), JASA, 51. Singh D 

TY1 SA \\ V 6 ' Ra °’ J ' N ' K - < 1961) ' ATSM ’ k Some situations are 
presented in which Yates and Grundy’s variance estimator is positive. 

of the wtr pps scheme is smlilenh^the ™riance oHte^odawt pp'Ichlme' 
estimator in wtr LmpUnf ria meqSl’ probibimii^^ ’ A “ altemative 

t ^ — * — « «* 

brings out'the suptriority^akiL ^ Censuses “ d Surveys." This 

sampling. * 8 plemen tary random starts in systematic 

15. Madow, W. G. (1953), AMS 24 p 

centered systematic sampling is better th/n ^ for m °notone populations 
«■ A simple method of ol r “ dom start sampling. 

«•* sitltioosT; ° b ‘ amng “ estimator of the variance in 

Seth, G. R. and J. N K "Rq /in* 

modem sampling with and withoutrepwL 5 ?^ 0 ' A26 ' Compares simple 
“• p athak, P. K. (1964) num . 1 for equivalent costs. 

Sennit with Unequal Probabilities i° s ’ in a ® h ° WS that the method of inverse 
h PPS) the second w ^h pps of th P enSe ’. equivalent to selecting the 
“• Pathak, P. k. (1964) nt /•! remainin g units, and so on 

population total in Blom ^nka, 51 ru mr , 

m inverse sampling with un P m i P&res two est imators of the 
J Ross ’ A. (1961), JASA 5fi 7„ Unequal probabilities. 

21 rr ° f the Variance es timator° CateS thg Sample to strata for minimizing 

Gives then „ 

means are to be estimated ** " ben not one but many linear 







22. Stephan, F. F. (1945), AMS, 16. This paper gives $E(1/n_h)$. The result
shows that stratification after selection is about equal to proportionate allocation.

23. Cochran, W. G. (1953), “Sampling Techniques.” Shows that small
deviations from the optimum allocation ordinarily do not bring about much loss in
precision.

24. The variance falls off inversely to the square of the number of strata when
the distribution within strata is uniform.

25. Raj, D. (1964), JASA, 59. Raj, D. (1953), Ganita, 4. An example of the
use of truncated distributions in stratification problems.

26. Raj, D. (1954), JISAS, 6. Gives an unbiased estimator of a ratio in
stratified sampling.

27. Raj, D. (1958), JASA, 53. Compares unstratified pps with stratified sam¬ 
pling under a mathematical model. 

28. Nanjamma, N. S., et al. (1959), Sankhya, 21. Gives a sampling system 
providing unbiased ratio estimators. 

29. Ghosh, S. P. (1963), AMS, 34. An extension of stratification problems to 
two variables. 

30. Folks, J. L. and C. E. Antle (1965), JASA, 60. A complete set of efficient 
allocations to strata is given when several characteristics are of interest. 

31. Williams, W. H. (1962, 1964), JASA, 57, 59. How to improve precision 
by using strata weights when the sample selected is simple random from the 
entire population is shown here. 

32. Raj, D. (1964), JASA, 59. Gives a condition for the usual approximation 
for the variance of a ratio estimate to be an understatement. 

33. This is an unbiased ratio-type estimator based on interpenetrating
subsamples.

34. Goodman, L. A. and H. O. Hartley (1958), JASA, 53. Here is a situation
in which the usual ratio estimator is better than the ratio-type estimator.

35. Murthy, M. N. and N. S. Nanjamma (1959), Sankhya, 21. Quenouille,
M. H. (1949), JRSS, B11. Provides an almost unbiased ratio estimator of R.

36. Hansen, M. H., et al. (1953), “Sample Survey Methods and Theory.” This
is an alternative derivation of the variance of the regression estimator.

37. Williams, W. H. (1962), JASA, 57. This gives a general method of
generating unbiased ratio and regression estimators.

38. Mickey, M. R. (1959), JASA, 54. This is another method of generating 
unbiased ratio and regression estimators. 

39. Olkin, I. (1958), Biometrika, 45. This shows that it is always better to 
include an additional variate in multivariate ratio estimation. 

40. De Pascual, J. N. (1961), JASA, 56. This is a form of a combined unbiased 
ratio estimator in stratified sampling. 





41. Keyfitz, N. (1957), JASA, 52. Provides a short cut to tl 
of variance when two units are selected from a stratum. 

42. Durbin, J. (1959), Biometrika, 46. If x is normally distributed and the
regression of y on x is linear, Quenouille's ratio estimator reduces not only the
bias but also the variance.

43. Cochran, W. G. (1963), “Sampling Techniques.” This is an excellent
demonstration of the fact that the optimum size of the cluster is not a fixed
characteristic of the population but depends on the cost structure of the survey.

44. Sampford, M. R. (1962), Biometrika, 49. A method of taking n distinct
psu's with probability proportionate to size.

45. Estimation of between- and within-psu components of the variance. 

46. Sukhatme, P. V. and R. D. Narain (1952), JISAS 4 Rai r> nos^ 

^ WWCh “ ^ £*■ d ° -i repeatin' £ ££ 

from a'tubs^pfinJdisfg 5 ^ ’ JISAS> 2 ' Estimates the S ain due to stratification 

the number of submit^ ii ’ a fando^^ai^bT 8 ^ a tW °" Stage desi S n in which 

T n N Umber " fiX6d * expected value 

Nanjamma, N. S. et al n Q wi a r case, 

an unbiased estimator of a ratio in multistat dLgm.^ & meth ° d ° f obtailli,1 8 

12 to multUtlge^am^U^ 1 ’ Th * S ' S an extension of the estimator of Exercise 
51 - Raj, D. (1966), Ganita 17 n- 

Sta * e *** »*ing use of condCblSv f ° rmin S estimates “ multi- 
“• Rai, D. (1954), JISAS ft r P rob ab.ht.es of selection only. 

rr rati03 in “oliistage sampfog an ° ther method o£ ““king unbiased esti- 

53 - Sethi, V.K o , 

- ° f —— 

from eac k subltmtumwit? 1 p &t randoin mto^^^ when the 

55. The e t* ™ PP t0 *’ ta and one P su ls selected 

stage designs. JIiSs > Bl5. a ml S Subsam Pled independently. 

”■ A estimator in do " ^ ““ VariSUCe in multi ' 

4®; J f A ’ 59 Tte“T 8 '? 3traUficati “ » obtained. 

" odof ^“ u “ a -- 

w e ° the initiai -i 1 double 





^o^TLtiroator to double-’sanfp'lfng de^“ * genetalUation of the unbiased 

61- Hans ® n 6t al * j 1953) ’.“ Sam Ple Survey Methods and Theory,” vol 2 The 

w^their?-^ b6tWeen the periods are esti- 

mated such that the change is the difference between the estimates for the two 

periods. 

62. Kulldorff, G. (1963), RISI, 31. The sample size on the first occasion and
the subsampling fraction at the second occasion are determined so that the
variance of the estimated sum is minimized.


63. Eckler, A. R. (1955), AMS, 26. Gives the best estimate of the mean at 
occasion h when information is collected from independent samples for the cur¬ 
rent period as well as for the previous period. 

64. Eckler, A. R. (1955), AMS, 26. This is a generalization of Exercise 63 to 
the situation in which differential sample sizes are used on different occasions. 

65. This shows how problems of 50 percent overlapping samples can be solved 
by the method of Exercise 63. 

66. Hansen, et al. (1953), “Sample Survey Methods and Theory.” Discusses 
a method of making monthly estimates when information is collected on the 
previous month, too. 

67. Woodruff, R. S. (1963), JASA, 58. Gives a procedure of reducing the 
variance due to the presence of unusually large observations when information 
is to be collected over time. 


68. When observations are subject to a constant bias, its effect on the simple 
average, ratio, regression and difference estimators is studied. 

69. Hansen, et al. (1961), BISI, 38. This shows that the variance of the sample 
proportion in the presence of uncorrelated response errors is smaller than the 
variance PQ/n normally used. 


70- Sukhatme, P. V. (1953), “Sampling Theory of Surveys with Applications.” 
This is an analysis of uncorrelated response errors under a specific model. 


71. Hansen, M. H., et al. (1953), “Sample Survey Methods and Theory.”
Extends the analysis of response errors to the situation in which a random sample
taken from the entire population is allocated to strata.

72. Birnbaum, Z. W. and M. G. Sirken (1950), JASA, 45. This gives bounds
on the bias due to noninterview. The sample size n is found such that the total
error is small at a given probability level.

73. El Badry, M. A. (1956), JASA, 51. This is a generalization of the
technique which involves one mail attempt followed by a personal interview of a
sample of the nonrespondents.

74. An offshoot of the plan of Politz and Simmons in which the not-at-homes are
brought into the sample without call backs.

75. Neter, J., E. S. Maynes, and R. Ramanathan (1965), JASA, 60. Studies
the effect of matching errors on the measurement of response bias under a certain
model.

76. Neter, J., E. S. Maynes, and R. Ramanathan (1965), JASA, 60. Presents
correlations between the reported values, the values matched, the response
errors, and the true values for the problem considered in Exercise 75.

77. Hansen, M. H., W. N. Hurwitz, and L. Pritzker (1964), “Contributions to
Statistics,” Calcutta. The net difference rate is estimated when observations
are subject to errors of response.

78. Bryson, M. R. (1965), JASA, 60. Upper and lower limits for the bias of an
estimate of the proportion of a population are obtained when there are errors of
classification and the survey is repeated on an identical sample.

79. A useful way of exhibiting the total error in terms of response and sampling
variances and associated terms.

80. Som, R. K. (1966), “Recall Lapse in Demographic Enquiries.” This is
how demographers propose to diminish the response bias arising from recall lapse.

81. Gives the effect of duplicating some units in order to keep the sample size 
as originally planned. 


82. Goodman, L. A. (1953), AMS, 24. This gives a simple method of improving
estimators in certain situations. 


83. This shows that an unbiased estimator of the variance gives a biased
estimator of the standard deviation, the bias being negative.
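The remark in item 83 can be checked on a small case by exact enumeration. The sketch below is only an illustration under stated assumptions: the five population values are hypothetical, sampling is with replacement with equal probabilities, and the function name `exact_expectations` is ours.

```python
from itertools import product
from math import sqrt


def exact_expectations(population, n):
    """Enumerate every ordered with-replacement sample of size n.

    All samples are equally likely, so plain averages over the
    enumeration are exact expectations (no simulation involved).
    """
    big_n = len(population)
    mean = sum(population) / big_n
    sigma2 = sum((y - mean) ** 2 for y in population) / big_n

    total_s2 = 0.0
    total_s = 0.0
    count = 0
    for sample in product(population, repeat=n):
        ybar = sum(sample) / n
        s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)
        total_s2 += s2
        total_s += sqrt(s2)
        count += 1
    return sigma2, total_s2 / count, sqrt(sigma2), total_s / count


# Hypothetical population of five values, samples of size two.
sigma2, mean_s2, sigma, mean_s = exact_expectations([1, 2, 3, 4, 10], n=2)
print(sigma2, mean_s2)  # the two agree: s^2 is unbiased for sigma^2
print(sigma, mean_s)    # the mean of s falls below sigma
```

The negative bias of s follows from the concavity of the square root; the enumeration merely exhibits it for one population.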

84. Goodman, L. A. (1960), JASA, 55. Here is a formula for the variance of 
the product of two random variables which are not necessarily independent. 

85. Hanurav, T. V. (1962), Sankhya, A24. A sampling scheme generates a 
unique sample design. The converse is proved here. 

86. Murthy, M. N. and V. K. Sethi (1961), JASA, 56. Discusses a method of
simplifying calculations by reducing the number of multipliers. 

87. Deming, W. E. and G. J. Glasser (1959), JASA, 54. Raj, D. (1961).
Theory is given for estimating by sampling methods the number of
names common to two lists.


88. Raj, D. (1961), JASA, 56. Extends the theory of Exercise 87 to the case
where the two lists are found to be merged into one.

89. Hartley, H. O. (1962), Proc. Soc. Stat. Sec., ASA. Gives theory for the case
in which two frames are used for sampling a population. As a particular case,
the second frame may be a list of very large establishments in a survey of
establishments.


superior 40 th 

K- Sukhar e B a Partl0Uiar daSS ° f eStimat0rs k “uilered. 

92. Sukhatme, B. V. and M. S. Avadhani (1965), AISM, 17. A method of
selecting a sample of two different units such that the probability of selecting
some nonpreferred samples is reduced to a desired level. In the example the
probabilities π_ij are given.



93. Avadhani, M. S. and B. V. Sukhatme (1965), JISAS, 17. A method of
selecting a sample of n > 3 units such that the probability of selecting some
nonpreferred samples is reduced to a desired level.

94. Warner, S. L. (1965), JASA, 60. Presents a method by which respondents
furnish information on a probability basis only when they do not want to reveal
to the interviewer the correct answer to a certain question.
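As an illustration of the idea in item 94, here is a minimal sketch of the usual randomized-response estimator. It assumes the standard form of the device: with a known probability P the respondent answers the statement "I belong to group A", otherwise the statement "I do not belong to group A", and only "yes" or "no" is reported. The function and argument names are ours, and the numbers in the usage line are hypothetical.

```python
def warner_estimate(yes_count, n, p_design):
    """Estimate the proportion pi in group A from randomized responses.

    yes_count : number of "yes" answers among n respondents
    p_design  : known probability P that the device points to the
                statement "I belong to A" (must differ from 0.5)
    Returns (pi_hat, var_hat).
    """
    if abs(p_design - 0.5) < 1e-12:
        raise ValueError("P = 0.5 carries no information about pi")
    lam = yes_count / n  # observed proportion of "yes" answers
    # E(lam) = P*pi + (1 - P)*(1 - pi); solve this for pi.
    pi_hat = (lam - (1 - p_design)) / (2 * p_design - 1)
    # Binomial variance of lam, carried through the linear transformation.
    var_hat = lam * (1 - lam) / ((n - 1) * (2 * p_design - 1) ** 2)
    return pi_hat, var_hat


# Hypothetical figures: 440 "yes" answers from 1,000 respondents, P = 0.7.
pi_hat, var_hat = warner_estimate(440, 1000, 0.7)
print(pi_hat, var_hat)
```

The closer P is to one half, the better the respondent is protected and the larger the variance becomes, since the divisor (2P - 1)^2 shrinks.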

95. Hajek, J. (1959), Casopis Pest. Mat., 84. Pathak, P. K. (1964), AMS, 35.
The concept of sufficiency is introduced and the Rao-Blackwell theorem used 
for improving estimators. A situation covered is that in which sampling with 
pps is continued till two different units are obtained. 


96. Hansen, M. H., W. N. Hurwitz, and T. B. Jabine (1963), BISI, XL. Some
problems are considered when the lists used for sample selection are found to be
imperfect.


97. Godambe, V. P. and V. M. Joshi (1965), AMS, 36. An example is given 
showing that corresponding to a nonlinear estimator of the population total, 
there does not exist a linear estimator which is uniformly better. 


98. Glasser, G. J. (1962), RISI, 30. Gives a rule for determining the points 
of cutoff for sampling skew populations. 


99. Tulse, R. (1957), Appl. Stat., 6. A simple method of selecting the sample 
when greater representation is to be given to the subgroup of main interest. 


100. Marks, E. S. (1948), AMS, 19. Gives a lower bound for the expected 
travel among m random points. 

101. Goodman, L. A. (1949), AMS, 20. Here is a method of estimating the 
number of classes in a population by examining a random sample selected from it. 


102. Durbin, J. (1967), Appl. Stat., 16. Brewer, K. R. W. (1963), Austr. J. 
Stat., 5. A method of selecting two units from a stratum such that the proba¬ 
bility of inclusion of a unit is strictly proportionate to its size. 

103. The parameters n and c of a single sampling plan are determined when the
lot tolerance proportion defective and the acceptable quality level are given.



104. Dodge, H. F. and H. G. Romig (1944), Sampling Inspection Tables. A
procedure of obtaining a sampling inspection plan by which the cost of inspection
is minimized for a specified value of the average outgoing quality limit.

105. The product estimator is used when the correlation between y and x is
negative.

106. Rao, P. S. R. S. and G. S. Mudholkar (1967), JASA, 62. A generalization
of the ratio and product estimators when auxiliary information on several
x-variates is available.

107. Approximate expressions for the bias and the mean square error of the
ratio estimator are obtained for a general sample design.


APPENDIX ONE


REPORT ON AN ACTUAL
SAMPLE SURVEY


A.1 GENERAL 

In order to illustrate some of the sampling principles presented in this
book, a description of an actual survey will now be given. 1 This survey
was conducted in Greece in April, 1962, as a part of a continuing series of
surveys of employment and unemployment in the country. The object
was to produce reliable national estimates of the number of persons
unemployed, duration of their unemployment, the number employed by branch
of economic activity, etc., and the changes occurring in the size and
composition of the components of the labor force since the population census
of March, 1961. The data collected related to the week ending April 8,
1962. Interviewers filled out questionnaires, specially designed for the
purpose, pertaining to members of the sample households aged 10 or more


1 The survey was carried out by the National Statistical Service of Greece
(NSSG), with Mr. B. Helger and the author as UN consultants. Thanks are due to
Mr. P. Couvelis, Director-General, NSSG, for permission to draw on the draft report
prepared for the use of the Government.


at the time of the survey. All civilians who passed the night of April 8
in a private household were the persons covered by the survey.
Institutional households (hotels, boarding houses, etc.) were
outside the jurisdiction of the enquiry and they formed about 3.4 percent of
the total population in March, 1961.


A.2 ADMINISTRATIVE REQUIREMENTS 

The sample for the survey was designed under certain administrative
requirements. Except for Greater Athens the staff required for making
out lists of dwellings for sample selection and for supervision of the work
of interviewers was very limited. And the area to be covered by a
supervisor had to be kept small for adequate supervision and effective
control of the various operations involved. This meant that not more
than about 60 primary sampling units (psu’s) could be selected in the
sample from the countryside.


A.3 PRINCIPAL RESOURCES AND MATERIALS

e A ffi Sf d r g n mad :ii 0f th trr rial tha ‘ available to b , 

number of persons 1961 <*nsus proven 11 help ma ke ; 

This data was availnw commu ne or munioi at tbat tlJ ne was t 
Population of io 000 n ^ b * 0ck m Greater Athen ^ ^ ^ tbe C0Untl 
rural areas. Sketch p * m ° Te} and by ED ('em S and ° tber ci ties with 
r ** the bSstr7 a de ? the time of £~ ion Strict) in t 

on as the ‘, he —y- f 

“Pulation Census of S fT depende 

1951 had to be use 

OBKtN 

°‘her'urb* the C0Utlt ry i nto th 

^a> atei” foc e b r ®‘ of ‘he ctunt With 19 6l populat^'" 0 ’^ 1 strata 

£ ral »ad u rban ? re «ty). Th t 5 (send Urb!n ,, f 1011 Ceding t 
(an C V h ? racte risti artS ° f the coun Ratification i s ; J** rural area s, call 
ti °nai? in ? Stl ' ativ eunit^ tlle rural* dlSer Widely w f t ? rtaUt because t! 

ps « c :rr to iab , c 

Sej murban a j lr y beterogeneo’ 
11 rural populatio: 


it is quite often partly plain and partly mountainous; it is agricultural and
somewhat industrial as well. And an eparchy is the single largest
unit which could be effectively supervised by one survey supervisor.


SAMPLING IN THE RURAL AREAS 

There were 147 eparchies in the country and they differed markedly in
their populations. Some of the eparchies were too small to be efficient
psu’s. Twelve of these were amalgamated with neighboring
ones to form 135 reasonably efficient psu’s. The next step was to stratify
the psu’s in order to reduce the between-psu component of the variance.
The variables used for purposes of stratification were: 1961 population of
the eparchy (excluding the urban part); proportion of the population
dependent on agriculture/industry; and per capita cultivated area. Since
about 40 supervisors were available for this part of the sample and each
supervisor could work in just one psu, the total number of psu’s to be
taken into the sample was automatically fixed at about 40. In order to
get valid estimates of sampling error there should be at least two psu’s
in the sample from each stratum. Thus in all 20 strata were formed.
One of the strata contained the two largest psu’s and another psu believed
to be highly variable with respect to employment. All three psu’s in
this stratum were selected with certainty. The other 19 strata were
made of about equal population of 240,000. This was particularly
convenient and somewhat efficient. From each of these strata two psu’s
were selected in the sample. The selection in each stratum was made
systematically with probability proportionate to size (1961 population)
after arranging the psu’s at random. The reason for using pps sampling
was that the eparchies differed considerably in size even after
stratification. Since the estimation of totals was of primary importance, the
variability in the size of psu had to be controlled. With regard to
subsampling from selected psu’s, it was considered desirable to introduce some
kind of stratification in view of the considerable diversity within psu’s.
The second-stage units were the communes which were arranged by
altitude within each sample psu. (Previous studies had shown that the
employment pattern in mountainous areas was different from that in plain
areas.) Two independent samples, each containing two communes, were
selected systematically with probability proportionate to population from
each sample psu. There were thus a total of 164 communes in the sample.
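The systematic selection with probability proportionate to size described above can be sketched as follows. This is a generic illustration of the technique and not the procedure actually programmed for the survey; the function name, the `start` argument, and the sizes in the usage lines are assumptions made for the example.

```python
import random


def systematic_pps(sizes, n, start=None, rng=None):
    """Select n units systematically with probability proportionate to size.

    sizes : measures of size, in the order in which the units are listed
            (the list may be shuffled first if a random arrangement is wanted)
    start : optional random start in [0, interval); drawn at random if None
    Returns the list of selected positions (0-based).
    """
    total = sum(sizes)
    interval = total / n
    if start is None:
        rng = rng or random.Random()
        start = rng.random() * interval
    selected = []
    cumulated = 0.0  # cumulated size of the units already passed
    index = 0
    for k in range(n):
        point = start + k * interval
        # Move forward until the cumulated size first exceeds the point.
        while index < len(sizes) - 1 and cumulated + sizes[index] <= point:
            cumulated += sizes[index]
            index += 1
        selected.append(index)
    return selected


# Hypothetical stratum of four psu's; two are to be selected.
print(systematic_pps([10, 20, 30, 40], 2, start=5.0))
print(systematic_pps([10, 20, 30, 40], 2, rng=random.Random(1962)))
```

A unit whose size exceeds the sampling interval can be hit more than once, which is one reason very large units are commonly placed in a stratum of their own and taken with certainty.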

The maps showing the ED boundaries for the sample communes were
examined. Within a sample commune the ED’s shown on the map were
listed along with their 1961 populations (in private households). From
each commune a systematic sample of four ED’s was selected with
probability proportionate to population. If the commune contained four or
fewer ED’s, all ED’s in the commune were included in the sample. The
supervisor of the psu made a list of all properties (called ekodomies) in
the selected ED’s by visiting all places there. A sample of properties
was selected systematically with equal probabilities. The sampling
fractions were worked out at headquarters in order to achieve a self-weighting
sample for an expected sampling fraction of 0.5 percent. All households
in the sample properties became the subject of further investigation by
enumerators trained for the purpose.


GREATER ATHENS

In Greater Athens, which represents roughly one-fourth of the population 
of Greece, there was no need to select the sample in large clusters. This 
is the seat of the National Statistical Service of Greece, which could 
easily spare a number of its trained employees to work for the survey for 
a day or two. As a result, the sample here was spread well over the entire 
area. On the basis of geographic contiguity, the 57 municipalities and 
communes in Greater Athens were allocated to 20 strata. The strata
were made of about equal size (size being judged by 1961 population- 
census figures). Within a stratum a list was made of all census blocks 
arranged in a serpentine fashion along with the number of persons in 
private households in each block. The very small blocks were amalga¬ 
mated with neighboring ones to form block-clusters of a reasonable size. 
From each stratum two independent samples, each consisting of seven 
block-clusters, were taken. In each sample the block-clusters were 
selected systematically with probability proportionate to the number of
persons enumerated in private households during the 1961 census. A list
was prepared of all dwelling units in the sample blocks. The sampling 
interval for dwelling units within block-clusters was so chosen that the 
sample became self-weighting for an expected sampling fraction of 0.5 
percent. The households in the sample dwelling units became the object 
of further study. 
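The choice of the sampling interval that makes the sample self-weighting can be sketched as below. The sketch rests on assumptions not spelled out in the text: the probability of taking a block-cluster is treated as (number of pps draws) × (cluster size) / (stratum size), with the 14 draws standing for two samples of seven, and all the numbers used are hypothetical.

```python
def within_cluster_interval(cluster_size, stratum_size, draws,
                            overall_fraction=0.005):
    """Sampling interval for dwelling units inside a selected block-cluster.

    The cluster is assumed to be taken with probability
    draws * cluster_size / stratum_size. Dwellings inside it are then
    taken at a rate that brings every dwelling's overall chance of
    selection to overall_fraction; the interval is the inverse of that rate.
    """
    p_cluster = draws * cluster_size / stratum_size
    if p_cluster < overall_fraction:
        raise ValueError("cluster too small: within-cluster rate would exceed 1")
    return p_cluster / overall_fraction


# Hypothetical stratum of 90,000 persons; a cluster with 450 persons;
# two independent samples of seven clusters, i.e. 14 draws.
interval = within_cluster_interval(450, 90000, 14)
print(interval)  # close to 14: about every 14th dwelling unit is taken
```

Since the interval is proportional to the cluster size, the expected number of dwellings taken per cluster is about the same whatever the cluster's size, which keeps the field workload even.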

OTHER URBAN AREAS 

The 62 municipalities and communes comprising the other urban areas 
were stratified on the basis of the proportion of population dependent on 
agriculture/industry and the rate of population growth (during the last 
decade). In all nine strata were formed, each stratum containing about 
200,000 persons. From each stratum a sample of two municipalities or 
communes was selected systematically with probability proportionate to 
the number of persons enumerated in private households during the 1961
census. Before sample selection the psu’s within strata were arranged
at random. For each sample psu the census data were used to make a
list of all blocks, together with the number of persons enumerated in
private households in them. The very small blocks were clustered with
neighboring ones. From each sample psu two independent samples, each
containing seven block-clusters, were selected systematically with
probability proportional to the number of persons in private households. The
supervisor of the area went from house to house to make a list of
dwelling units in each sample block-cluster. A systematic sample of dwelling
units was taken from each block-cluster, the sampling interval being so
determined that the entire sample became self-weighting for an expected
sampling fraction of 0.5 percent.


A.5 PERSONNEL AND EQUIPMENT 

The supervisory staff of the survey were the chiefs of the various field 
offices of the National Statistical Service of Greece. They are men with 
considerable experience in collecting data in different fields. They were 
given thorough training at headquarters in the purposes of the survey 
the procedures of listing, and the methods of measurement. They in turn 
trained the interviewers of the survey, who were local school teachers with 
previous experience consisting of the collection of labor-force data at the 
time of the census of 1961. While in the field, the supervisors checked 
the filled-in schedules for obvious inconsistencies, missing entries, etc. 
When the schedules arrived in Athens, a specially trained staff scrutinized 
them thoroughly according to instructions prepared beforehand and coded 
the relevant entries. The data were then transferred to punched cards 
and the tables produced. 


A.6 STATISTICAL ANALYSIS AND COMPUTATIONAL PROCEDURES

Since the sample was made self-weighting with an expected sampling 
fraction of 0.5 percent, the estimation of population totals was simply 
made by multiplying the sample totals by 200. The sampling errors were 
based on the differences within strata of the two subsamples (eparchies in 
the rural areas, block-clusters in Greater Athens, and municipalities in 
other urban areas). This quick method of calculation of sampling errors
may not be considered entirely appropriate as far as the rural and other
urban areas are concerned, where the psu’s were selected without
replacement. In order to study this quick procedure as compared with the
unbiased, but difficult, procedure outlined in Sec. 6.4, calculations were
made for selected items. Table A.1 gives




Table A.1 Ratio of the biased to the unbiased estimates
of variance for selected items

Item
number  Item                             Males   Females  Persons

1       population of all ages           1.254   0.949    1.114
2       population aged 10 and more      1.155   0.780    1.001
3       the active                       1.178   1.138    1.086
4       the employed                     1.202   1.162    1.116
5       the unemployed: total            1.185   1.274    1.195
6       the unemployed: inexperienced    1.473   1.130    1.513
7       population not active            1.140   1.230    1.144

the ratio of the estimated variances in the two cases. It will be seen that 
by and large the biased estimator overestimated the variance, the average 
overestimation being of the order of about 16 percent. To this extent 
the estimator is conservative and safe to use (Raj, 1964). 
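The quick computations of this section can be sketched as follows. The sketch assumes that each stratum carries two independent subsamples, each taken at half the overall fraction, so that a subsample total raised by 400 estimates the stratum total; the stratum figures in the usage lines are hypothetical, while the ratios are those of Table A.1.

```python
def estimate_total_and_variance(strata, raising_factor=200.0):
    """Quick estimates from paired subsamples.

    strata : list of (y1, y2), the sample totals of the two subsamples
             in each stratum
    The total is the raising factor times the combined sample total.
    Each subsample separately gives 2 * raising_factor * y_k, and the
    variance of the mean of the two is estimated by a quarter of their
    squared difference, i.e. (raising_factor * (y1 - y2)) ** 2.
    """
    total = 0.0
    variance = 0.0
    for y1, y2 in strata:
        total += raising_factor * (y1 + y2)
        variance += (raising_factor * (y1 - y2)) ** 2
    return total, variance


# Hypothetical sample totals for two strata.
total, variance = estimate_total_and_variance([(10, 12), (20, 18)])
print(total, variance)

# Ratios of Table A.1 (males, females, persons for items 1 to 7).
table_a1 = [
    1.254, 0.949, 1.114,
    1.155, 0.780, 1.001,
    1.178, 1.138, 1.086,
    1.202, 1.162, 1.116,
    1.185, 1.274, 1.195,
    1.473, 1.130, 1.513,
    1.140, 1.230, 1.144,
]
average_ratio = sum(table_a1) / len(table_a1)
print(round(average_ratio, 3))  # about 1.16, i.e. roughly 16 percent too high
```

The plain average of the 21 ratios agrees with the overestimation of about 16 percent quoted in the text.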


A.7 PRECISION OF THE SURVEY 


Based on the quick estimator of variance, which slightly overestimates the 
true variance, sampling errors of a large number of items were calculated, 
Table A.2 gives the coefficients of variation of a few selected estimates of


Table A.2 Relative sampling errors of selected items, percent

                               Aged 10                      Total     New       Non-
                     All ages  or more  Active   Employed   unem-     unem-     active
                                                            ployed    ployed

All Greece
  Males                1.7       1.7      2.1       2.2       6.7      12.8      2.8
  Females              1.1       1.0      3.0       3.3       7.3       7.5      2.1
  Persons              1.3       1.3      1.9       2.2       5.5       7.4      1.9

Greater Athens
  Males                2.0       2.2      2.2       2.1       7.6      17.1      3.9
  Females              1.7       1.8      3.2       3.1       7.1       9.3      2.0
  Persons              1.6       1.7      1.8       1.8       5.6       9.1      1.9

Other urban areas
  Males                4.0       3.8      3.4       3.0      12.3      25.6      5.5
  Females              3.0       2.8      9.5      10.1      16.3       9.8      3.0
  Persons              3.5       3.2      4.6       4.3      12.2      10.2      3.2

Rural areas
  Males                2.4       2.6      3.2       3.4      16.4      ...       ...
  Females              1.5       1.2      3.7       3.9      18.2      ...       ...
  Persons              ...       ...      ...       ...      ...       ...       ...




the population of Greece classified according to activeness or
nonactiveness. It will be found that except for the inexperienced unemployed (a
very small subclass) most of the other estimates were subject to small
sampling errors.


A.8 QUALITY OF RESULTS 

Some indication of the quality of the results obtained from the survey can 
be had by making a comparison with the census data. The census was 
taken at about the same time 1 year ago. Table A.3 gives comparative
estimates from the two sources, the census figures being based on a 2 
percent sample taken for making advance estimates. The comparison is 

Table A.3 Population by activity status, as estimated by April, 1962, survey

                                  April     March                  Standard
                                  1962,     1961,     Difference   error of
Item                              sample    sample    (d)          difference
                                  (000)     (000)     (000)        (s) (000)    (d/s)

Males
  Population aged 10 or more      3127.8    3103.5       24.3        55.3        0.44
  Active                          2465.2    2390.3       74.9        52.2        1.43
  Employed                        2345.2    2266.8       78.4        53.0        1.45
  Unemployed: total                120.0     123.5       -3.5         8.3       -0.42
  Unemployed: inexperienced         24.6      41.4      -16.8         3.4       -4.94**
  Not active                       662.6     713.2      -50.6        19.6       -2.58*
  Population of all ages          3940.8    3867.3       73.5        66.6        1.10

Females
  Population aged 10 or more      3623.4    3509.6      113.8        36.7        3.10**
  Active                          1618.2    1189.9      428.3        49.5        8.62**
  Employed                        1517.0    1076.2      440.8        50.1        8.80**
  Unemployed: total                101.2     113.8      -12.6         7.7       -1.64
  Unemployed: inexperienced         51.8      46.2        5.6         4.1        1.36
  Not active                      2005.2    2319.8     -314.6        42.5       -7.40**
  Population of all ages          4365.4    4238.1      127.3        48.5        2.62*

All persons
  Population aged 10 or more      6751.2    6613.1      138.1        84.7        1.63
  Active                          4083.4    3580.2      503.2        76.8        6.55**
  Employed                        3862.2    3343.0      519.2        85.9        6.04**
  Unemployed: total                221.2     237.2      -16.0        13.2       -1.21
  Unemployed: inexperienced         76.4      87.5      -11.1         6.0       -1.85
  Not active                      2667.8    3033.0     -365.2        50.8       -7.19*
  Population of all ages          8306.2    8105.4      200.8       109.1        1.84


and are given here for



revealing. The survey gave a far higher rate of labor-force participation 
than the census. The explanation seems to lie in the fact that
“employment” is an elusive character. For many persons (especially women)
attachment to the labor force is not a fixed fact but an attitude that may 
vary considerably according to the manner the questions are asked and 
the circumstances prevailing at the time of interview. The more careful 
and detailed procedures of the survey helped in classifying such persons 
better than the general census. 
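The last column of Table A.3 is simply the difference divided by its standard error. A short sketch for a few rows is given below, with the figures copied from the table (in thousands); the variable names are ours, and flagging ratios larger than 2 in absolute value is only an illustrative rule of thumb, not the criterion behind the asterisks in the table.

```python
# (label, April 1962 survey, March 1961 census sample, s.e. of difference)
rows = [
    ("Males, active", 2465.2, 2390.3, 52.2),
    ("Males, unemployed: total", 120.0, 123.5, 8.3),
    ("Females, active", 1618.2, 1189.9, 49.5),
    ("Females, employed", 1517.0, 1076.2, 50.1),
    ("All persons, population of all ages", 8306.2, 8105.4, 109.1),
]

ratios = {}
for label, survey, census, std_error in rows:
    difference = survey - census
    ratio = difference / std_error
    ratios[label] = ratio
    # Illustrative flag only: a difference of more than two standard errors.
    flag = "large" if abs(ratio) > 2 else ""
    print(f"{label:40s} d = {difference:7.1f}   d/s = {ratio:6.2f}  {flag}")
```

The large ratios for active and employed females are what the text refers to when it says the survey gave a far higher rate of labor-force participation than the census.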


A.9 EFFICIENCY IN RELATION TO OTHER SAMPLING DESIGNS 


The results of a survey often provide information on the efficiency of the 
sample design actually used in relation to other sampling designs which 
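might have been used. Based on the present survey, a comparison was
made with the following alternative methods of selection of two psu’s per
stratum in the principal strata of other urban areas and the rural areas:


Scheme A Selection of psu’s with replacement with pp to population

Scheme B Selection of psu’s with probability proportionate to aggregate size

Scheme C Selection of the first psu with pps and the second with pp to the
sizes of the remaining units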


COMPARISON WITH SCHEME A 
Using the formulas given in Sec. [...], the amount of gain in precision
attained through the without-replacement scheme of the survey as
compared with with-replacement sampling was calculated for a few
selected items. The results are given in Table A.4.
It will be seen that the gain in precision averaged about 8
percent, the range being -[...] percent to [...] percent. The reduction in the
between-psu component (Table A.5) turned out to be about 21 percent. Thus, as
compared to the overall relative gain in precision of 8 percent, the
reduction in the between-psu component was more than twice as high. The
reason that the reduction in the total variance was low was that the
within-psu component (Table A.6) was generally more important than
the between-psu component. This was because the psu’s had been
subsampled at a very low rate.


Table A.4 Ratio of the estimates of variance based on with-replacement
sampling to without-replacement sampling

Item
number  Item                             Males   Females  Persons

1       population of all ages           1.127   0.974    1.057
2       population aged 10 and more      ...     ...      ...
3       the active                       ...     ...      1.043
4       the employed                     ...     ...      1.058
5       the unemployed: total            ...     1.137    1.097
6       the unemployed: inexperienced    1.236   1.065    1.256
7       population not active            ...     1.115    ...


Table A.5 Percent reduction in the between-psu component of
variance (base: estimated between-psu variance for wr sampling)

Item
number  Item                             Males   Females  Persons

1       population of all ages           20      -12      12
2       population aged 10 and more      14      *        0.2
3       the active                       16      17       10
4       the employed                     17      19       12
5       the unemployed: total            18      20       16
6       the unemployed: inexperienced    36      86       51
7       population not active            13      34       20

* Base zero.




Table A.6 Estimated between- and within-psu contributions to the
total relative variances (coefficients of variation squared)

Item           Males                 Females               All persons
number    Between    Within     Between     Within     Between     Within

1         0.00017    0.00017    0.000046    0.00015    0.00010     0.00014
2         0.00018    0.00021    0.0000061   0.00016    0.000073    0.00016
3         0.00026    0.00029    0.00035     0.00068    0.00023     0.00035
4         0.00030    0.00031    0.00039     0.00075    0.00026     0.00038
5         0.0034     0.0049     0.0068      0.0055     0.0038      0.0035
6         0.010      0.014      0.00014     0.013      0.0026      0.0079
7         0.00052    0.00063    0.00014     0.00050    0.00017     0.00042
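The three tables hang together, and the link can be sketched numerically. The sketch assumes that the components of Table A.6 refer to the without-replacement design actually used, that the within-psu component is the same under both schemes, and that the percent reduction of Table A.5 is taken on the with-replacement between-psu variance as base; the function name is ours.

```python
def wr_to_wor_variance_ratio(between_wor, within, reduction_percent):
    """Ratio of total variance with replacement to that without replacement.

    between_wor       : between-psu component under the design used (Table A.6)
    within            : within-psu component (Table A.6)
    reduction_percent : percent reduction in the between-psu component
                        relative to with-replacement sampling (Table A.5)
    """
    between_wr = between_wor / (1 - reduction_percent / 100)
    return (between_wr + within) / (between_wor + within)


# Item 1, males: Table A.4 shows 1.127.
print(round(wr_to_wor_variance_ratio(0.00017, 0.00017, 20), 3))
# Item 5, females: Table A.4 shows 1.137.
print(round(wr_to_wor_variance_ratio(0.0068, 0.0055, 20), 3))
# Item 6, persons: Table A.4 shows 1.256.
print(round(wr_to_wor_variance_ratio(0.0026, 0.0079, 51), 3))
```

The small discrepancies from Table A.4 are of the size to be expected from the rounding of the published components. The computation also shows why a 20 percent reduction in the between-psu component gives a much smaller gain overall when the within-psu component is of about the same size or larger.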









COMPARISON WITH SCHEME B 

Suppose that two psu’s are selected from each stratum with probability
proportionate to aggregate size. Let $T_i$ be an unbiased estimator of the
ith psu total $Y_i$, the estimator being based on subsampling. Let
$V(T_i) = \sigma_i^2$ and $\hat{\sigma}_i^2$ denote an unbiased estimator of
$\sigma_i^2$. Then an unbiased estimator of the stratum total would be

$$t = X \, \frac{S\,T_i}{S\,x_i}$$

with a variance of


">'>!<»■<-<.> (ns-*) +ir y£ 


+ °V 


,2 


H- Xj 


4 

Let t„. be the probability with which psu’s i and j are actually selected in 


sf^iK [fcrs - •) c' + w 

- 2 («■ _ 1 2_\ m m 

' Xi + I,/ ’’ 


+ *t + t, 




Table A 7 J 

techniques P including°^ e (equafprobab-^r 11 PPaS SampIin g and other 

P (equal probability) sampling. It i s dear that 

- with other techniques 


Between-p 

varianci 



Males of all ages 

Sh“' edl0ormore 

number active 
Number employed 
Number not active 
Unemployed: total 

^experienced 


8.5 
8.0 
6.8 
5.8 
3.3 

1.6 
1.0 


3.2 
2.4 

2.3 
2.0 

2.4 
1.7 
1.1 


2.7 

3.3 
3.0 
2.9 

1.4 
0.95 
0.90 


3.9 

5.3 
4.6 

4.3 
1.8 
0.61 
0.71 


--- u. 5IU U. 

10 ‘he fact that p pas 8 a “ d pps aam Piing e mter mediate between 
Wlth PI« and the aeo T Un 8 isec luivalent.T h ,l COU d possiblv be * 

nd Witb probability* Sele0ti ° n ° f the “ 







COMPARISON WITH SCHEME C 

Now let the first psu be selected with pps and the second with pp to the
sizes of the remaining units. If $\pi_{ij}$ be the probability with which psu’s
i and j are actually selected in the survey, it is fairly simple to see that an
unbiased estimator of the between-psu contribution to the variance would
be

IA _ Vi + pA ViVi 17 Ti _ TjV _ /tf 
xA 2 ) 2 [\ Vl pj w vN. 

This formula was used to obtain entries in Table A.8. The average 


Table A.8 Percent reduction in the between-psu variance
(base: estimated variance for with-replacement pps sampling)

Males 


Females 


Persons 


L IVIIV 

Population of all ages 

14.0 

11.9 

.... 

5.1 

5.2 

Population aged 10 or more 

13.1 

. 15.4 

11.4 

The active 

13.4 

16.3 

12.0 

The employed 

10.7 

13.0 

10.5 

16.4 

ftVkAIlf 1 1 

The unemployed 

Population not active 

10.8 

20.1 

i 1_ 


reduction in the between-psu variance worked out to be about 11


ovstematic sampling with 

IW. V JXZ-j" —* ' 

bility proportionate to s.*e 

.. 59 . 



APPENDIX TWO 


PRINCIPAL NOTATION USED 


Wherever the notation used is not explained, it should ordinarily have
the following meaning.


(a_ij)       the matrix of coefficients a_ij
B or β       population regression coefficient
b            sample regression coefficient
B(R̂)         bias in the estimator R̂
β2           the fourth moment divided by the square of the second
             moment
C2(U,W)      conditional covariance of U and W given Hj
C1(U,W)      covariance of U and W over all Hj
Cov          covariance
cv           coefficient of variation
E            expected value
E2(U)        conditional expected value of U given Hj
E1(U)        expected value of U over all Hj
ep           equal probability sampling





/ 

h 

MSE 

MVU 

N 



T <i 

R = Y/X 


p 

S 


S' 


Sy* 

a 2 



01 Hi 

V 

Hu) 

HU) 

Wr 

wtr 

K = Nh/N 

i! 

Y 

? 


is estimated by 
approximately equal to 
sampling fraction 
stratum h 
mean square error 
minimum variance unbiased 
number of units in the population 
number of units in the sample 

number of combinations of N things taken n at a time 

population proportion 

sample proportion 

probability proportional 

probability proportional to size 

probability proportional to aggregate size 

probability 

primary sampling unit 

probability that the unit Ui is selected in the sample 

probability that Ui and U,- are both included in the sample 

population ratio of y to x 

correlation coefficient 

summation over all units in the sample 

summation over all different pairs of units in the sample 

2(F» - ?)>/(N - 1) 

S( yi - y) 2 /in - 1) 

positive square root of variance 

2(Yi - YY/N 

summation over all units in the population 

summation over all different pairs of units in the 
population 
U given Hj 
variance 

conditional variance of U given Hj 
variance of U over all Hj 
with replacement 
without replacement 
weight of stratum h 

mean of the preliminary sample of size n' 

population total for the character y _ 

population mean for y 
sample mean for y 
an estimator of Y 




APPENDIX THREE


RANDOM NUMBERS 


01 



02 08 04 05 06 07 08 09 10 11 12 is ^ 


25 

19 

17 

50 

50 

46 

26 

54 

61 

41 

41 

91 

88 

83 

97 

50 

71 

35 

65 

67 

15 

96 

17 

27 

35 

82 

80 

77 

21 

48 

84 

49 

72 

93 

48 

85 

12 

09 

36 

72 

81 

06 

49 

57 

40 

54 

64 

88 

97 

07 

43 

79 

37 

60 

96 

75 

80 

07 

51 

15 

59 

55 

24 

40 

71 

81 

93 

03 

03 

60 

50 

24 

44 

84 

14 

02 

13 

51 

36 

08 

02 

99 

65 

46 

62 

81 

28 

56 

90 

81 

19 

83 

33 

85 

65 

91 

68 

33 

24 

05 

75 

46 

93 

05 

64 

28 

40 

31 

45 

53 

96 

36 

21 

23 

47 

38 

68 

53 

19 

00 

78 

78 

51 

53 

72 

74 

66 

96 

71 

70 

61 

05 

98 

46 

24 

17 

92 

11 

04 

92 

55 

69 

47 

19 

10 

36 

47 

75 

17 

81 

21 

31 

84 

98 

35 

04 

66 

64 

83 

34 

75 

05 

83 

68 

55 

63 

72 

35 

45 

48 

17 

48 

46 

21 

44 

88 

44 

33 

02 

47 

97 

47 

49 

A Af 

91 

93 

73 

14 

15 

01 

45 

p A 

42 

46 

06 

93 

60 

41 

50 

18 

69 

56 

74 

73 

10 

16 

51 

02 

89 

87 

66 

41 

43 
52 
01' 
94 

44 

*>/ 

73 

69 

~rs 

i 46 
42 

70 

34 

^2 

63 

06 

73 

01 

69 

. it 

32 

' -4k 

19 

65 

53 

49 

* » 

41 

33 

78 

19 

17 

04 

19 

68 

98 

32 


92 

62 

41 

27 

66 

85 

60 


30 

32 

75 

59 

03 

58 

58 

( U 

45 

73 

09 

17 

60 

68 

38 

OO 

os 

28 

97 

11 

26 

72 

02 

88 

wu 

96 

66 

75 

82 

36 

33 

77 

97 

35 

73 

04 

02 

03 

10 

81 

34 

44 

69 

03 

12 

94 

45 

86 

74 

66 

39 

46 

33 

42 

41 

29 

83 

73 

80 

49 

12 

61 

68 

00 

44 

58 

02 

42 

53 

38 

35 

05 

67 

73 

95 

71 

17 

46 

16 

45 

72 

36 

51 

84 

51 

20 

85 

22 

94 

38 

95 

58 

41 

50 

80 

91 

11 

62 

17 

85 

77 

15 

53 

18 

87 

75 

39 

09 

20 

73 

52 

84 

82 

81 

84 

57 

60 

99 

82 

84 

93 

66 

50 

06 

54 

28 

00 

56 

78 

63 

90 

79 

03 

63 * 

27 

02 

60 

44 

64 

67 

41 

35 

00 

84 

20 

51 

17 

17 

89 

52 

52 

65 

59 

36 

63 

23 

35 

15 

03 

79 

56 

48 

99 

77 

96 

71 

72 

67 

99 

24 

18 

40 

58 

65 

35 

98 

48 

02 

53 

51 

48 

26 

41 

11 

16 

45 

18 

99 

41 

51 

94 

64 

83 

03 

04 

12 

38 

93 

25 

03 

29 

72 

47 

02 

70 

30 

96 

01 

06 

30 

09 

31 

29 

52 

49 

68 

82 

39 

51 

57 

21 

54 

95 

58 

76 

46 

05 

13 

87 

13 

61 

08 

73 

29 

60 

25 

42 

09 

50 

42 

45 

01 

nO 

62 

22 

41 

29 

65 

24 

43 

2* 

58 

74 

08 

05 

11 

38 

94 

IP 

09 

56 

83 

25 

40 

01 

22 
■ A 

U* 

02 


67 80 84 09 69 67 52 








16 

81 

60 

53 

02 

95 

00 

67 

11 

75 

16 


17 18 19 20 21 22 23 24 


58 

25 

12 

68 

68 

32 

16 

23 

25 

67 


85 

84 

75 

01 

53 

t 

10 

84 

91 

43 

93 


33 

42 
59 
17 
92 

43 
57 
28 
39 
59 


16 

22 

76 

09 

82 

45 

42 

97 

13 

86 


25 

81 

18 

46 

46 

80 

49 

70 

27 

17 

44 

44 

96 

11 

09 

46 

90 

22 

50 

50 

35 

48 

16 

10 

07 

50 

01 

66 

62 

10 

13 

35 

98 

13 

29 

68 

37 

23 

74 

77 

40 

43 

78 

99 

64 

90 

37 

63 

74 

14 

04 

84 

87 

41 

64 

43 

95 

90 

88 

46 

94 

20 

01 

52 

38 

64 

04 

67 

90 

38 

91 

89 

73 

11 

07 

24 

43 

43 

01 

91 

77 

67 

34 

95 

86 

09 

91 

95 

96 

96 

36 

59 

12 

33 

44 

67 

03 

48 

83 

77 

87 

07 

87 

94 

15 

70 

83 

47 

08 

44 

35 

61 

24 

35 

08 

33 

90 

47 

53 

07 

10 

86 

00 

20 

21 


11 

94 

42 

00 

11 

44 

18 

34 

14 

81 

96 

99 

82 

53 

10 

38 

61 

92 

47 
30 

03 

27 
82 
25 
29 

48 
99 
59 

28 
15 

70 

92 

63 

64 
25 


87 

38 

73 
38 

96 

48 

97 
06 
29 
53 

68 

43 

38 
83 
67 

47 

37 

37 
23 
64 

89 

34 

74 

44 
69 

33 

27 

13 

85 

39 

33 

03 

55 

57 

38 


12 

96 

48 

12 

03 

02 

25 

48 

63 

07 

34 

11 

91 

95 

28 

86 

85 

14 

89 

66 

57 

43 

59 

69 

79 

23 

54 

33 

77 

38 

87 

01 

43 

02 

66 


17 

52 

95 

31 
47 

29 

03 

44 

79 
69 

08 

36 

73 

82 

66 

17 

44 

25 

80 
72 

82 

61 

52 

32 
89 

60 

40 

76 

72 

60 

92 

69 

88 

75 

72 


26 

26 

27

28 

29 

30 

39 

12 

11 

07 

72 

20 

03 

38 

97 

12 

87 

15 

57 

51 

31 

12 

50 

82 

52 

22 

24 

73 

89 

09 

31 

35 

59 

02 

23 

84 

03 

71 

82 

60 

44 

48 

16 

56 

57 

02 

46 

13 

87 

56 

80 

11 

02 

46 

33 

69 

90 

40 

59 

83 

33 

47 

40 

14 

70 

07 

88 

78 

35 

34 

55 

49 

95 

04 

05 

19 

52 

40 

62 

44 

72 

30 

09 

91 

13 

26 

25 

16 

55 

89 

79 

16 

26 

74 

55 

78 

59 

64 

26 

02 

36 

17 

14 

96 

63 

98 

71 

28 

88 

78 

96 

90 

90 

00 

49 

91 

95 

59 

60 

06 

38 

19 

28 

01 

63 

44 

34 

07 

71 

33 

49 

80 

52 

24 

53 

09 

84 

27 

76 

29 

85 

59 

84 

16 

35 

04 

27 

03 

98 

84 

36 

79 

99 

56 

05 

63 

63 

87 

15 

15 

27 

59 

61 

32 

54 

74 

63 

89 

69 

65 

15 

88 

82 

08 

84 

23 

05 

57 

14 

43 

87 

93 

20 

89 

37 

55 

20 

44 

52 

85 

28 

63 

36 

54 

02 

85 

92 

92 

72 

23 

80 

06 

83 

24 

91 

23 

41 

95 

06 

18 

50 

88 

21 

00 

24 

82 



INDEX 




Numbers in italics indicate Exercises and are followed by their respective page 
references. 


Acceptable quality level (AQL), 220
Acceptance number, 220 
determination of, 103, 260, 104, 261
Acceptance sampling, 219 
Accuracy versus precision, 28 
Admissible estimator, definition, 202 
for general design, 202 
Aggregate size, sampling with pp to, 
93-96 

in double sampling, 147-148 
efficiency, 95-96 
estimate, 94 
sampling scheme, 93 
in two-stage sampling, 52, 241 


variance, 94 
strata of equal, 72 
Aggregates (see Total)

Agriculture, need for information in, 20 


Ajgaonkar, S. G. P., 268 
Allocation of sample size in stratified 
sampling, 65 
equal, 71 

for estimating several items, 77, 21, 232
complete set of allocations, 30, 234
N-proportional, 66 
optimum, 65 
X-proportional, 67, 69
Almost unbiased ratio estimator, 86, 235 
Antle, C. E., 265

Arbitrary probabilities, selection with, 51 
Area sample, definition, 219 
used with list sample, 219, 89, 254, 96, 
257 

Attributes, sampling for (see Proportions) 
Autocorrelated populations, 211 




Auxiliary information, use of, 85 
in difference estimation, 99 
multiauxiliary, 102 
in pps sampling, 49 
in ratio estimation, 86 
in regression estimation, 100 
Availability at home, probabilities based 
on, 183 

Politz and Simmon’s method, 183- 
184 

with s calls per sample unit, 74, 249
Avadhani, M. S., 268, 269
Average outgoing quality (AOQ), 220 
of double sampling plan, 222 
limit of, 220 

of single sampling plan, 220 
Average sample number (ASN), 220 
for double sampling plans, 222 
for single sampling plans, 220 
Ayoma, H., 72, 84 


Basu, D., 263 

Best linear estimator, 201-203 
existence of admissible, 202 
nonexistence of uniformly unbiased, 201-202

Best weight function for combination of 
variables, 16-17 

Bias, effect on confidence intervals, 29-30

of estimate of standard deviation, 83, 252

estimation of, due to noninterview, 78, 72, 248

of ratio estimators, 86-87, 35, 235 
due to recall lapse, 185, 80, 251 
by reinterview, 178, 78, 250 
of response, 167 
in selection, 26 

of simple estimator, in cluster sampling, 112, 11, 228
of variance, 6, 227 
use of biased estimators, 28-29 
variance of estimate of, 77, 250
Birnbaum, Z. W., 267 
Blackwell, D., 256, 269

Boundaries of strata, determination of, 
69-73 

Dalenius-Hodges rule, 72
Ekman’s rule, 72 
under equal allocation, 71 



Boundaries of strata, determination of,
under optimum allocation,
under proportional allocation,
strata of equal aggregates,
for two-dimensional stratification, 73, 29, 233

Boundary effect, bias produced by,
Bounds of MSE of ratio estimate,


Brewer, K. R. W., 269 
Bryant, E. C., 206, 222 
Bryson, M. R., 268 


Call-backs, 182 
Cauchy’s inequality, 17, 65 

Census, followed by sample checks, 178
versus sample survey, 21 
Centered systematic sample, 16, 229
Central-limit theorem, 18 
use in sampling work, 28 
Chakravarty, I. M., 202, 223, 263 
Change, estimation of, 157-158 
best estimate of, 157 
under given condition, 61, 245 
simpler estimate of, 158 
Changes in selection probabilities, adjustment for, 204-205
Characteristics of population, 21 
Classes, estimation of number of, 101, 
259 

Cluster sampling, compared with sampling elements, 109
estimation of efficiency, 111-112 
estimation with equal clusters, 108-110 
estimation of proportions, 110-111 
estimation with unequal clusters, 
112-113 

optimum size of cluster, 109-110, 48, 238

reasons for, 107-108 
subsampling of clusters (see Multistage sampling)
of two units per cluster, 53, 241 
Cochran, W. G., 72, 84, 133, 137, 159, 161, 163, 211, 222, 265,
Coefficient, of correlation (see Correlation coefficient)

of variation, definition, 36 
sample size, 36 
use in estimation, 90, 254 
Collapsed strata, method of ( 







Collection of information, by interview, 24

by mail, 24 

on probability basis, 94, 255 
Combined ratio estimate, compared with 
separate, 105 
mean square error, 105 
short cut to computation of variance, 41, 237

in two-stage stratified sampling, 128 
unbiased, 26, 233, 28, 233, 40, 237 
upper limit to bias, 105 
Complementary events, 2 
Complementary pairs, systematic sampling with, 14, 228
Complete enumeration versus sample 
enumeration, 21 

Complete set of allocations to strata, 30, 234

Components of variance, between and 
within clusters, 110 
for response errors, 170-171 
Concentration of estimates around 
mean, 28 

relationship to variance, 14 
theorems on, 14, 18 
Conditional expectation, 13 
Conditional probability, of events, 2 
of random variables, 3 
Conditional variance and covariance, 14 
Conditioning of respondents, 185 
Confidence intervals, 29, 87 
effect of bias on, 29 
Consistent estimates, 27 
Constant bias, effect on estimators, 68, 
247 

Controlled selection, definition, 214-215 
of more than two units, 93, 255 
similarity with purposive selection, 
214-215 

with stratification, 76 
of two units, 92, 255 
Cornfield, J., 83 

Correlation coefficient, definition, 10 
effect on ratio estimates, 92 
intraclass, 16 
limits of, 10 

use in sampling over time, 155, 160 
Correlogram, 214 

Cost function, for best sampling fraction 
for nonrespondents, 80 

in double sampling, 141 


Cost function, in estimating parametric 
functions, 77 

general considerations, 30 
for optimum cluster size, 48, 238
for optimum probabilities of selection, 
129-130 

in stratified sampling, 65 
in two-stage sampling, 117 
Couvelis, P., 271n.

Covariance, definition, 10 
formula for, 14 
interviewer covariance, 171 
of linear functions, 12 
of sampling and response deviations,
180 

Cumulation of sizes for pps selection, 47, 
51 

Cumulative rule for strata boundaries, 72

Current estimates, 153 
for sampling, on several occasions, 
161-162, 63, 245 
on two occasions, 154, 156, 160 
Cutoff point for skew populations, 98, 258 

Dalenius, T., 70, 72, 84 
Das, A. C., 264 

Degrees of freedom in stratified sampling, 193-194

effect on variance estimate, 193 
Deming, W. E., 268 
De Pascual, J. N., 265 
Dependent random variables, 3 
illustration of, 6 

Difference estimation, definition, 99 
in double sampling pps selection, 145- 
147 

in double sampling simple random, 
140-142 

in sampling over two occasions, 153 
in single phase sampling, 99 

comparison with simple average, 100 
estimate and variance, 99 
use with several x-variates, 102 
Distinct units, comparison of wtr and 
wr schemes, 40, 17, 230 
estimator based on, 40 
in pps sampling, 203, 96, 256 
in random sampling for n distinct 
units, 1, 225 

in simple random wr sampling, 40 
in two-stage sampling, 66, 242 





Dodge, H. F., 222, 270 
Domains of study, estimation for, 199- 
201 

in stratified sampling, 200 
in unstratified sampling, 199 
Double sampling, 139-152 
for biased ratio estimation, 148-149 
difference and ratio estimates compared, 149-150

for difference estimation, with several 
x-variates, 58, 243
in simple random sampling, 140-142 
in multistage sampling, 69, 244 
of nonrespondents, 78-80 
for pps estimation, 142-145 

compared with simple random, 143- 
144 

with pps selection, 145-147 
compared with single sampling, 147 
for regression estimation, 150-151 
compared with single sampling, 151 
for stratification, 151-152, 57, 243 
compared with single sampling, 152 
estimator of variance, 57, 243 
technique of, 139-140 
for unbiased ratio estimation, 147-148 
for unbiased ratio-type estimation, 

60, 244 

Double sampling plans for inspection, 
221-222 

Duplicate units, in list and area samples, 
219, 89, 254, 96, 257 
in two lists, 87, 253, 88, 253 
Duplication of units to preserve sample 
size, 185, 81, 252 

Duplications in a frame, avoidance of, 218 
estimation of extent, 101, 259 
Durbin, J., 217, 264, 266, 269 


Eckler, A. R., 162, 163, 267 
Ekman, G., 72, 84 
El Badry, 267 

Elementary units, compared with cluster 
sampling, 109, 111-112 
sampling of, 107-108 
Equal aggregate size, strata of, 72 
Equal probability, sampling with, 33 
using difference estimate, 99
using ratio estimate, 86 
using regression estimate, 100 
using simple unbiased estimate, 35, 38



Error, mean square, 29 
Errors, of misclassification, 78 
of response, 166

sources of, 185 

Essential conditions of survey, 167
Estimation, of mean, 35 
of proportion, 42 
of ratio, 85 
of total, 36 
Evans, W. D., 83 
Events, complementary, 2 
definition, 2 
independent, 2 
intersection of, 2 
mutually exclusive, 2 
union of, 2 

Execution of surveys, 25 
Expansion factor, 36 
Expectation (see Expected value) 
Expected amount of inspection, 220
for double sampling plans, 222 
for single sampling plans, 220 
Expected sampling fraction, 124 
Expected survey value, 167 
Expected value, conditional, 9 
definition, 7 
of linear combination, 8 
of product, 8, 9 
inequality for, 11 
of sum, 8 

theorem using conditional, 13 
Extraneous units in frame, 218 


Fellegi, I. P., 264 
Feller, W. G., 2, 18 
Field work, organization of, 25 
Finite population, definition, 21 
Finite population correction, 36 
First-stage sampling unit, 108 
Fisher, R. A., 214, 222
Folks, J. L., 265 

Fourth moment for variance estimation, 190

Frames, definition, 23 
imperfect, 218

sampling from two frames, 89, 254 
use of list and area sample, 219, 96, 


„ 240 

Gain, from stratification, 75-76, 4 > 
from wtr sampling, 122-123 










General procedure of forming estimators, 

Ghosh, S. P., 265 
Glasser, G. J., 268, 269
Godambe, V. P., 201, 222, 266, 269

Goodman, L. A., 12, 18, 86, 96, 98, 106, 265, 268, 269

Goodman, R., 76, 84 

Greece, employment survey, 271-281

household survey, 22 

Grouping strata to estimate variance, 74 
Grundy, P. M., 55, 60, 264 

Hajek, J., 269 

Hansen, M. H., 13, 18, 79, 83, 84, 117, 130, 136, 167, 179, 182, 186, 191, 223, 265, 267, 268, 269

Hanurav, T. V., 268 

Hartley, H. O., 86, 97, 98, 106, 132, 133 
“J 183 ' lsf >. M7, 222, 223, 

Helger, B., 271n.

Hodges, J. L., 72, 84 
Horvitz, D. G., 52, 53, 60, 198, 223 
Horvitz-Thompson’s estimator, 52 
Household survey of Greece, 22 

Hurwitz, W. N., 13, 18, 79, 84, 130, 136, 268, 269


Identically distributed variables, theorem on, 38

Identifiability of units, use in sampling, 

Imperfect frames, sampling from, 218- 
219, 89, 254, 96, 257
Incomplete lists, sampling with, 219 
Inconsistency of classification, index of, 
182 

Independent events, 2 
Independent random variables, 3 
illustration of, 6 

Independent samples, use in double 
sampling, 141 

for difference estimation, 141-142 
for pps selection, 144-145 
with pps selection, 147 
Independently distributed random 
variables, theorem on, 38 
Industry, need for information in, 20 
Inflation factor, 36 


Information, need for collection of, 19 
in agriculture, 20
in industry, 20 
on internal trade, 20 
on labor, 20 
on population, 19 
Infrequent items, sampling for, 43 
Inspection by sampling, 219-222 
double sampling plans, description, 

expected amount of inspection, 222 
operating characteristic, 221 
outgoing quality, 222 
single sampling plans, expected
amount of inspection, 220 
operating characteristic, 219 
outgoing quality, 220 

Interchangeable random variables, 18, 230

Internal trade, need for information 
on, 20 

Interpenetrating subsamples, 171 
estimate and variance, 120, 168 
scheme, 119-120, 171

stability of variance estimator, 190-191

in stratified sampling, 191-193, 71, 248

Intersection of events, 2 
Interviewer covariance, 171 
Interviewers, bias due to, 166 
effect on variance, 171 
optimum number of, 172, 70, 247
Intraclass correlation, definition, 16
within interviewer assignments, 170 
used in cluster sampling, 110 
used in systematic sampling, 45 
Inverse matrix, use for finding best 
weights, 17 6 

Inverse sampling with pp to size, 18, 230
estimation in two-stage sampling,
scheme, 18, 230

Jabine, T., 269 

Jessen, R. J., 159, 163, 222 

Job of sampling statistician 19 

mJtSn^V variabl “’ 3 





Joint estimation of current mean and 
change, 61, 245
Joshi, V. M., 269 
Judgment sampling, 26 


Keyfitz, N., 204, 205, 223, 266 
Khamis, S. H., 40, 60, 263
Kish, L., 76, 84, 178, 186 
Koop, J. C., 203, 223 
Koopmans, T. C., 204, 223 
Kulldorff, G., 267 

Kurtosis (see Fourth moment for variance estimation)


Labor, need for information in, 20 
Labor force survey in Greece, 271-281 
efficiency, 278-281 
precision, 276 
quality of results, 277-278 
sample design, 272-275 
Lagrange’s undetermined multipliers, 
method of, 129 

Lahiri, D. B., 48, 60, 93, 106, 144, 163, 
204, 223 

Lansing, J. B., 178, 186 
Latin square design, compared with one-way stratification, 83-84
estimation, 82, 84 
use with small samples, 206 
use in stratification, 81 
Law of large numbers, 18 
use in sampling work, 27 
Limit theorems, 18 
central-limit theorem, 18 
law of large numbers, 18 
Linear combination of variables, best 
weights for, 16 
expected value of, 8 
variance and covariance, 11, 12 
Linear estimators, definition, 27 
nonexistence of uniformly best, 202 
Linear programming, changing selection 
probabilities, 204-206 
determination of optimum probabilities, 8, 227

method of overlapping maps, 203-204 
Linear regression estimate (see Regression estimate)

Lists, sampling from imperfect, 218-219, 89, 254



Lists, used with area samples, 219, 96, 257

Loss function, 29

use in estimating several characters, 77

Lot tolerance percent defective, 220



Mackenzie, W. A., 214, 222 
Madow, L. H., 211, 223 
Madow, W. G., 13, 18, 84, 136, 211, 

223, 264 

Mahalanobis, P. C., 73, 84, 117, 137, 
171, 186 

Mail surveys, 78 

and interview of nonrespondents, 78 
several mail attempts, 73, 248 
Margin of error, 21 
Marks, E. S., 269 

Matching of sample to previous occasions, 153

Matching errors, 75, 249, 76, 250
Matching lists by samples, for two 
combined lists, 88, 253
for two separate lists, 87, 253 
Mathematical expectation, definition, 7 
theorems on, 8-9 

Mathematical model, for correlations 
between observations on several 
occasions, 160 

for finite population as sample from 
infinite population, 4, 226
for obtaining MSE of ratio estimate, 
91 

for response errors, 166-167, 175 
for studying recall lapse, 80, 251 
Maximum likelihood estimation, nonuse 
in sampling work, 30 
Maynes, E. S., 267, 268 
Mean, arithmetic, 21 
harmonic, use in pps sampling, 4, 226
Mean square error, definition, 29 
justification for use of, 29 
of mean with response errors, 171 
of ratio estimate, 89 
relation to variance and bias, 29 
Measures of size, best values of, 49
as used in pps sampling, 47 
Memory failure (see Recall lapse)
Method, of overlapping maps, 204
of random groups, 194-195 











Mickey, M. R., 265 
Midzuno, H., 93, 106
Minimum-variance unbiased (MVU) 
estimator, best weights for, 16 
conditions for existence of, 17 
definition, 17 

in sampling on several occasions, 
160-162 
theorem on, 17 
variance of, 18 

Misclassification, errors of, 182 

effect on estimation of proportion, 
78, 250 

Moments, central, 190 
use in variance estimation, 190-193 

Mudholkar, G. S., 270 

Multiauxiliary information, with difference estimator, 102
in double sampling, 58, 243
effect of augmentation of x-variates, 104, 39, 237

with ratio estimator, 104 
use of, 102 

Multiphase sampling (see Double sampling)

Multiple stratification, 81, 206 

Multistage sampling, psu’s selected with 
equal probabilities, 113-117 
advantages of, 108 
formation of estimators, 114 
sampling and subsampling fractions, 
117 

unbiased estimate, 116 
variance and its estimation, 114-116
psu’s selected wr with unequal probabilities, 119-121

compared with wtr sampling, 122-123

estimation of ratios, 127 
estimator based on distinct subunits, 46, 239, 66, 242
sampling and subsampling fractions, 129-130

self-weighting estimator, 120-121 
unbiased estimator, 120-121 
variance and its estimation, 120 
psu’s selected wtr with unequal probabilities, 117-119
estimation of ratios, 125
rule for estimating variance, 215-217, 66, 243



Multistage sampling, psu’s selected wtr
with unequal probabilities, unbiased estimator, 118
variance and its estimation, 118-119

stratified, 123-124 

choice of optimum probabilities, 
130-131 

estimation, of ratios, 128-129 
of totals, 123 

self-weighting estimator, 124 
variance of estimator, 123 
useful designs, 132-136 

one psu, per randomized substratum, 133-134
per stratum, 132-133
psu’s selected with pps of remainder, 135-136

randomized systematic sampling, 

132 

Multivariate estimation, difference 
method, 102 
ratio method, 104 
Murthy, M. N., 264, 265, 268 
Mutually exclusive events, 2 
Mutually independent variables, limit 
theorems on, 18 

Mutually uncorrelated variables, 
variance of sum of, 11 


N-proportional allocation, 66
Nanjamma, N. S., 265, 266 
Narain, R. D., 56, 60, 266 
National Statistical Service of Greece, 
22, 31, 271, 275 
Neter, J., 185, 186, 267, 268 
Neyman, J., 66, 84 
Neyman’s allocation, 65-66 
best stratum boundaries for, 71 
Nonnormality, its effect on variance 
estimation, 193 

Nonpreferred samples (see Preferred 
samples) 

Nonprobability sampling, 26 
Nonresponse, bias produced by, 78 
generalization to several strata of 
nonrespondents, 73, 248 
increased sample for estimating proportion, 72, 248
need for handling, 25 


Nonresponse, optimum sampling fractions for nonrespondents, 79-80
Politz and Simmon’s method, 184, 74, 249

Nonsampling errors, 165-186 
estimation in presence of, 168-170
use in interpenetrating subsamples, 
171 

Normal distribution, use in sampling
work, 28-29 

Not-at-homes (see Availability at home) 

Number, of classes in population, estimation of, 101, 259
of degrees of freedom, 193-194
of names common to two lists, estimation of, 87, 253
estimation from combined lists, 

88, 253 

of strata, 73-74, 84, 232 


Olkin, I., 104, 106, 265 
One psu per randomized substratum 
design, 133-134 
per stratum design, 132-133 
Operating characteristic, of double sampling plans, 221
of single sampling plans, 219
Optimum allocation of sample, comparison with proportional, 66-67
in double sampling for regression, 151 
effect of deviations from optimum, 88, 232

for estimating several items in survey, 30, 234

for estimating several parametric 
functions, 77-78 

for interviewing nonrespondents, 80 
for minimum variance variance-estimator, 20, 231

in stratified sampling for estimating 
proportion, 68 

in stratified simple random sampling, 
65-66 

in stratified unequal probability sampling, 68-69

Optimum number of enumerators, 
determination of, 171-172 
with stratification, 70, 247 
Optimum percent to match in sampling, 
on several occasions, 161-162 
on two occasions, 155, 157-159 



Optimum probabilities of psu’s, determination of, 130-131
Optimum size of sampling unit, 109-110, 48, 238

Optimum weights for linear combination of variables, 16-17

Ordered estimator, 57-59, 18, 228 
compared with unordered, 13, 228
in multistage sampling with unequal 
probabilities, 50, 241, 51, 241
Osborne, J. G., 214, 223 
Overall sampling fraction, 124 
Overlapping maps, method of, 203-204 


Pairing of strata, 74-75 
Parametric functions, estimation of, 77 
minimization of cost, 77 
minimization of cost plus loss, 77, 
81, 232 

minimization of variances, 78 
Partial replacement of sample in sampling over time, 155
Pathak, P. K., 264 
Patterson, H. D., 156, 163 
Peakedness, effect on variance estimation, 191

Percentages (see Proportions) 
Periodicity, effect on systematic sampling, 209-210

Personal questions, answered on probability basis, 94, 255
Pilot survey, 25 

Planning of surveys, steps involved in, 
22 

Politz, A. N., 183, 187 
Politz and Simmon’s method, 183-184 
Population, need for collecting information on, 20

Populations, autocorrelated, 211 
finite, 21

with linear trend, 210 
with periodic variation, 209-210 
in random order, 46 
sampled, 23 
target, 23 

Precision of estimators, 28 
sample size for specified, 36, 43 
Preferred samples, selection of, 76, 92, 255, 93, 255

Pretest of questionnaires, 25 


INDEX 


Primary sampling units, 108
Pritzker, L., 268

Probabilities, of inclusion in sample, relation among, 54
optimum, for selection of psu’s, 130- 
131 

Probability, conditional, 2
of events, 2 

of intersection of events, 2 
limits with Tchebycheff inequality, 

of union of events, 2 

Probability distribution of variables, 3
illustration of, 5 

Probability proportionate to aggregate 
size, 93-96 

compared with pps sampling, 96 
in double sampling, 147-148 
estimate and variance, 94 
sampling scheme, 93 
in two-stage sampling, 52, 241
Probability proportionate to size, multistage wr sampling of psu’s, 119-121

estimate and variance, 120 
estimation of ratios, 127 
improved estimator, 66, 242 
for wtr sampling of second-stage 
units, 46, 239 

multistage wtr sampling of psu’s, 
117-119 

comparison with wr sampling, 
122-123 

estimate and variance, 118 
for estimation of ratios, 125-126 
with one psu, per randomized substratum, 133-134
per stratum, 132-133
with psu’s selected with pps of remainder, 135-136
with randomized systematic sampling of psu’s, 132
in stratified sampling, 123 
single-stage wr sampling, 47-50 
admissible estimator, 203 
compared with equal probability 
selection, 50, 4, 226, 6, 226
compared with ppas sampling, 96 
estimation procedures, 48-49, 16, 
229 

methods of selection, 47-48


Probability proportionate to size, single- 
stage wtr sampling, 50-60 
comparison with wr sampling, 55- 
57 

necessary condition, 56-57 
sufficient condition, 56 
estimation procedures, 52-55 
Horvitz-Thompson’s estimator, 
52-53 

Raj’s estimator, 57-59 
Yates-Grundy’s estimator, 55 
selection procedures, 50-52 
with arbitrary probabilities, 51 
inverse sampling, 18, 230 
with pps of remainder, 50 
with systematic sampling, 51-52 
simpler variance estimator, 6, 227 

Probability sampling, definition and 
properties, 26 

Product of variables, expected value of, 9

inequality for, 11 

variance for dependent variables, 84, 
252 

variance for independent variables, 12

Product estimation, 106, 261 
with multiauxiliary information, 106, 
261 

Proportional allocation (N-proportional), 66

compared with optimum allocation, 66 
compared with stratification after 
selection, 22, 232 

compared with unstratified sampling, 
66-67 

compared with X-proportional allocation, 67-68

self-weighting estimate, 67 
strata boundaries for, 70 
Proportional allocation (X-proportional), 
67 

compared with N-proportional allocation, 67-68

compared with unstratified sampling, 
69 

Proportions, estimation of, 21 
in cluster sampling, 110-111 
in presence of nonresponse, 72, 248 
in presence of response errors, 181, 
69, 247 

sample size needed for, 42 




Proportions, estimation of, in simple random sampling, 42
in stratified sampling, 63-64, 68 

Purposes of survey, 22 
Purposive selection, 214 
and controlled selection, 214-215 

Quadratic estimator of total, 97, 258 
Quality check of information, 25 
Quenouille, M. H., 238
Quenouille’s ratio estimator, 42, 238
Questionnaires, construction of, 25 
pretest of, 25 

Quota sampling, definition, 26n. 

Quotient of variables (see Product of 
variables) 


Raising factor, 36 

Raj, D., 40, 50, 55, 58, 60, 69, 73, 77, 84, 
90, 92, 94, 96, 102, 106, 121, 122, 
132, 134, 135, 137, 140, 142, 145, 
148, 155, 163, 165, 178, 187, 190, 
195-197, 204, 216, 223, 263-266, 

268, 276, 281 

Ramanathan, R., 267, 268 

Random experiment, 1 

Random function, 2 
illustration of, 5 

Random groups, method of, 194-195 

Random numbers, 284-285 

Random sampling, with replacement, 

33 

without replacement, 33 

Random substitution to achieve weighting, 81, 252

Random variables, definition, 2 
expected value of, 7 
independence of, 3 
probability distribution of, 3 
variance of, 9 

Randomized response, method of, 94, 255


Randomized systematic sampling, 51
extension to multistage sampling, 132
Rao, C. R., 18, 156, 163, 256

4 N 2M - W2 ’ 133 ’ 1M ' 137 > 

Rao, P. S. R. S., 270




Rao-Blackwell theorem, 95, 255
Rare items, sampling for, 43
Ratio estimate, 85-98
accuracy of approximate variance, 90-91, 32, 235
adjustments to decrease bias, 42, 238
almost unbiased, 35, 235

based on average of ratios, 97 
based on pp to aggregate size, 93
bias of, 86-87, 107, 262 
bounds on mean square error, 90-91
compared with simple average, 91-92 
in double sampling, 147-149 
effect of near proportionality on, 92
efficiency, 91-92 
estimated variance, 93 
first approximation to variance, 89 
general methods of generating, 37, 236, 38, 236

mean square error of, 89, 107, 262 
multivariate, 104, 39, 237 
reason for using, 85-86 
reduction of bias of, 42, 238
as special case of general, 101 
in stratified single-stage sampling, 
104-105, 26, 233, 40, 237 
in stratified two-stage sampling, 128 
upper bound to relative bias, 87, 107, 
262 

Ratio-type estimator, based on interpenetrating subsamples, 33, 235
comparison with ratio estimate, 98, 
34, 235 

in double sampling, 60, 244 
unbiased, 97 
Ratios, population, 21 
Recall lapse, adjustment for, 80, 251 
Reference period, 24-25 
Regression coefficient, definition, 99 
Regression estimate, 100-102 
compared with other estimates, 101 
definition, 100 
in double sampling, 150 
general method of generating, 37, 

236, 38, 236 

in repeated sampling of same population, 159-160

as special case of general, 101 
variance of, 100, 36, 236 
Re-interviews for estimating bias, 77, 250
for estimating response variance, 182







Repeated sampling of the same population, 152-163

estimates of change, 157-158 
estimates of sum, 158-159 
estimation using past and current 
data, 162-163, 63-66, 245-246 
estimation using succeeding month 
data, 66, 246 

minimum variance current estimates 
156-157 

optimum percent matched, 153 
for change, 158 
for current estimates, 157 
for sum, 159 

regression estimate, 159-160 
sampling on several occasions, 160-161 
sampling on two occasions, 153-160 
surprise stratum technique, 67, 246 
Repetitive surveys, problems involved 
in, 153 

Replacement, of sample, 153, 157-159 
sampling with, 33 
sampling without, 33 
Replicated sampling (see Interpenetrating subsamples)

Response bias, definition, 167 
estimation of, 178, 77, 250 
Response deviation, 179, 79, 251 
Response errors, 165-186 
adjustment for recall lapse, 80, 251 
analysis with pps sampling, 178-179, 

79, 251 

analysis with stratification, 178-179, 

71, 248 

as arising from matching of records, 

75, 249, 76, 250 

compensating and uncorrelated, 176 
effect reflected in variance estimate, 
181, 69, 247 

in estimation of proportion, 181, 69, 
247 

examples of, 185-186 
index of inconsistency of classification, 182

interviewer contribution to, 170, 175 
model assumed for, general, 166-167
restricted, 175 

use of interpenetrating subsamples, 168 
Response variance, definition, 180 
estimation of, 182 
simple response variance, 180 
upper limit for, 181 



Restricted sampling designs, cluster 
sampling, 108 
Latin square, 81 
stratified sampling, 63 
systematic sampling, 43 
Risk function, minimization of, 82, 252 
Romig, H. G., 222, 270 
Root mean square error (see Mean 
square error) 

Ross, A., 264 

Rotation sampling (see Repeated sampling of the same population)
Rounded off multipliers, 86, 253 
Roy, J., 202, 223, 263 
Rules for variance estimation in multistage sampling, Durbin’s r, 66, 243
Raj’s r, 215-217 

Sampford, M. R., 266 
Sample, versus census, 21 
pps, 47 
random, 33 
systematic, 43 

Sample design, definition, 31 
guiding principle of, 31 
Sample points, 1 
Sample proportion, 42 
variance in simple random sampling, 
42 

Sample size, allocation in stratified 
sampling, 65-66 

determination of (see Size of sample) 
Sample space, 1 
Sampled population, 23 
Sampling, with probabilities proportionate to size (see Probability proportionate to size)
with replacement, with equal probabilities, 33

estimator based on distinct units, 40 
general procedure of estimation, 39 
without replacement, with arbitrary 
probabilities, 51 
compared with wr sampling, 39, 17, 230

with equal probabilities, 33 
general procedure of estimation, 39 
inverse sampling, 18, 230 
systematic selection on cumulated 
sizes, 51 

with unequal probabilities, 50 






Sampling, over several occasions, 160-162

current estimate, 161 
replacement policy, 161-162 
variance of estimate, 161 
over two occasions, 153-160 
for current estimates, 153-157 
MV estimation, 156 
with pps selection, 155 
replacement policy, 155 
for current mean and change, 61, 245

for estimates of change, 157-158 
MV estimation, 157 
simpler estimate, 158 
for estimates of sum, 158-159 
MV estimates, 159 
simpler estimate, 159 

sampling fractions, choice of, 62, 245

with unequal probabilities, 47 
Sampling deviation, 179 
correlation with response deviation, 180

Sampling error, 26-27 
Sampling fraction, 35 
Sampling inspection plans, 219-222 
double, average outgoing quality, 222 
description, 221

expected amount of inspection, 222 
operating characteristic, 221 
single, average outgoing quality, 220 
description, 219
determination of parameters, 103, 260, 104, 261
expected amount of inspection, 220 
operating characteristic, 219-220 
Sampling interval, 43 
Sampling method, role of, 22 
Sampling schemes, pps, 47 
simple random, 33 
systematic, 43 

Sampling statistician, job of, 19 
Sampling unit, choice of, 48, 238 
definition, 23 

Satterthwaite, F. F., 193, 223 
Searls, D. T., 268 

Second-stage units, definition, 108 
fixed or random number, 48, 240
Selection probabilities, adjustment for 
changes in, 204-206 



Self-weighting estimate, advantages of, 67
in stratified sampling, 67
in two-stage sampling, 108, 116, 130

Sen, A. R., 264

Separate ratio estimate, 105
compared with combined ratio estimate, 105
liability to bias, 105
mean square error of, 105
Sequence of random variables, central-limit theorem, 18
law of large numbers, 18

Seth, G. R., 264

Sethi, V. K., 266, 268 
Several items, estimation of, 77 
sample allocation for minimization 
of cost, 77 
plus loss, 77 
of variances, 78 

of weighted sum of variances, 30, 234

Short-cut in variance estimation with 
two units per stratum, 41, 237
Shortest path among m random points, 100, 259


Simmons, W. R., 183, 187 
Simple average, comparison with ratio 
estimate, 92 
definition, 35 

Simple random sampling, 33-43 
compared with stratified random 
sampling, 66-67 

estimated variance of mean, 37, 38 
estimated variance of proportion, 42 
method of selection, 33 
sample size needed, for mean, 36 
for proportion, 43 
variance of mean, 35, 38 
variance of proportion, 42 
Simplex method, 204 
Singh, D., 264 

Single sampling inspection plans, average outgoing quality, 220
description, 219 

determination of parameters, 103, 260, 104, 261

expected amount of inspection, 220 
operating characteristic, 219-220 
Sirken, M. G., 267 








Size, measures of, 49
of sample, determination of, in double sampling, 141, 151
in sampling over two occasions, 62, 245
in simple random sampling, 36, 43


in stratified sampling, 66 
of sampling unit, choice of, 43, 238 
Skew populations, sampling from, 67-68 
location of cut-off point, 98, 258
Skewness, coefficient of, 91
Small plot bias, 186
Som, R. K., 268

Sources of error in surveys, 185-186
Stability of variance estimation, 189-199

based on interpenetrating subsamples, 190, 193

connection with number of degrees of 
freedom, 193-194 
an example, 192 

for quick variance estimation, 192-195 
in randomized pps systematic sampling, 197-198

sample allocation to strata for, 20, 231
in stratified wr pps sampling, 193-194
in wr pps sampling, 193
in wr sampling with equal probabilities, 190-191
in wtr pps sampling, 196
Stages of sampling (see Multistage sampling)

Standard deviation, biased estimate of, 83, 252
definition, 10

Standard error (see Variance)
Standardized normal variable, 18 
Stephan, F. F., 265 
Steps in a sample survey, 22-25 
Strata, collapsed, 74 


construction of, 69 

of equal aggregate size, 72-73, 25, 232 
optimum number of, 73, 24, 232 
Stratification, after selection of sample, 22, 232, 31, 234
best variable for, 69
with controlled selection, 76
with double sampling, 151-152
Latin square, 81
number of degrees of freedom, 194
number of strata, 73
reason for, 61
for skew populations, 67-68
with small samples, 81
in two dimensions, 81
Stratified sampling, 61-84 

allocation of sample for stable vari¬ 
ance estimation, 20, 231 
allocation in unequal probability sam¬ 
pling, 68-69 

compared with unstratified random 
sampling, 66-67 
construction of strata, 69 
dependent selection, 76 
estimation of mean and total, 63-64 
estimation of proportion, 63-64 
estimation of several items, 77, 30, 234
estimation of several parametric 
functions, 77 

gain from stratification, 75-76 
general formula for variance, 62 
involving controlled selection, 76 
in multistage sampling, 123-124 
with one unit per stratum, 74 
optimum allocation, 65
proportionate allocation, 66-67 
for proportions, 68 
with ratio estimates, 104-105 
sample size needed, 66 
with self-weighting estimator, 67 
Stratum of nonrespondents, 78-80 
Stratum boundaries (see Boundaries of 
strata) 

Stuart, A., 65, 84 

Subgroup, increased representation to, 
99, 259 

Subpopulations, analysis for, 199-201 
in simple random sampling, 199 
in stratified sampling, 200 
in two-stage sampling, 201 
(See also Domains of study) 
Subsampling, of nonrespondents, 78-80 
(See also Multistage sampling) 

Subset, estimation for (see Domains of 
study) 

Substitution to accomplish weighting, 

81, 252 

Successive occasions, sampling on (see 
Repeated sampling of the same 
population) 

Successor-predecessor method, 96b, 257 
Sufficient partition, 95, 256 




Sufficient statistic, definition, 95, 256 
order statistic as sufficient, 96a, 256 
use in improving estimators, 96b, 256 
Sukhatme, B. V., 267, 268, 269 
Sukhatme, P. V., 175, 186, 187, 266, 267 
Sum, of variables, expected value of, 8 
variance of, 11 

on two occasions, estimation of, 158- 
159 

replacement policy, 159 
sample size for, 62, 245 
Superpopulation, concept of, 211 
for comparison of pps and equal 
probability sampling, 4, 226, 6, 
226 

for comparison in stratified sam¬ 
pling, 27, 233 

for studying performance of sys¬ 
tematic sampling, 217 
Supplementary information, use of, 85 
in difference estimation, 99 
in pps sampling, 49 
in ratio estimation, 86 
in regression estimation, 100 
Surprise stratum technique, 67, 246 
Survey design (see Sample design) 
Symmetrical distributions, effect on 
MSE of ratio estimate, 91 
Systematic sampling, advantages, 43 
in autocorrelated populations, 211-214 
centered, 15, 229 

comparison with stratified and simple 
random sampling, 46 
in complementary pairs, 14, 228 
effect of periodic variation, 209-210 
estimation of variance, 46 
expected sample size and its variance, 
14, 228 

method of drawing, 43 
in populations, in random order, 46 

showing trend, 210-211 
with probability proportionate to 
size, 51 

relation to cluster sampling, 43-44 
sample estimate, 44 
in two-stage sampling, 119-121 
variance of the estimate, 44, 45


Target population, 23 
Taylor’s expansion, 88 
Tchebycheff inequality, 14 






Telescoping of events, 185 
Tests of questionnaires 9 .^ 
Thompson, D. J., 52, 60
Three-stage sampling (see Multistage sampling)

Time reference of survey, 24 
Time series, sampling for 1 x 0 
Total of a population 21 ’ 163 

estimation of, by double

in multistage sampling ijg 
by ratio method, 86 
by regression technique, 100 
in simple random sampling 
in stratified sampling, 62 
Total error, minimization of, 171
Training of interviewers, 25
Transportation problem, 204
Trend, estimate of, 157-159
populations showing, 210 
True value, 165 

Truncated normal, use in stratification, 25, 232

Tulse, R., 269

Two-dimensional population, stratification of, 29, 233

Two-phase sampling (see Double sampling)

Two-stage sampling, psu’s selected with equal probabilities, 113-117
best values of sampling and subsampling fractions, 117
plan with total sample size random, 48, 240
ratio estimation, 126
unbiased estimate, 116
variance and variance estimator, 114-116

psu’s selected wr with unequal probabilities, 120-121
choice of optimum probabilities, 130-131

comparison of different subsampling schemes, 46, 239, 55, 242
comparison with wtr sampling, 122-123

determination of sampling and subsampling fractions, 129-130
estimation of ratios, 127 
unbiased estimator, 120 
variance and variance estimator, 
120 







Two-stage sampling, psu’s selected wtr with unequal probabilities, 117-119
estimation of ratios, 125, 49, 241
rule for estimation of variance, 215-217, 56, 243
unbiased estimator, 118
variance and variance estimator, 118-119
stratified, 123 
estimation of ratios, 128 
self-weighting estimator, 124 
variance of estimator, 123 
Two units per stratum, computation of 
variance, 41, 237 
Durbin’s method, 102, 260 
Fellegi’s method, 10, 227 
independent selections till units are 
different, 11, 228, 95, 256 
Raj’s method, 57-59, 135-136 
Two-way stratification, estimate, 82 
reasons for, 81 
selection procedure, 81 
with small samples, 206-209 
estimate, 207

selection procedure, 206
variance, 208
use of Latin square, 81
variance of estimate, 82 

Unbiased estimate, definition, 17 
example of nonexistence of, 91, 254 
with minimum variance, 17 
conditions for existence of, 17 
use in sampling work, 27-28 
Unbiased ratio estimation, 93-96 
in double sampling, 147-148, 60, 244 
general methods of generating, 37, 236, 38, 236
in stratified sampling, 26, 233, 28, 233
in two-stage sampling, 43, 241, 52, 241

Unbiased ratio-type estimator, 96-98
based on interpenetrating subsamples, 33, 235

comparison with ratio estimator, 98, 
34, 235 

in double sampling, 60, 244 
estimate and variance, 97 
Unbiased selection, 26 



Uncorrelated variables, variance of sum 
of, 11 

variance estimator, 60 
Unequal probability sampling, with 
replacement, 47 
without replacement, 50 
Uniform sampling fraction, 124 
Uniformly best linear estimator, non¬ 
existence of, 202 
Union of events, 2 
Unit of sampling (see Sampling unit) 

U. S. Bureau of the Census, 133 
Universe (see Population) 

Unordered estimator, as superior to 
ordered, 13, 228 

Unrestricted random sampling (see 
Simple random sampling) 

Variance, conditional, 14 
formula for, 14 
of linear combination, 11 
as measure of dispersion, 10 
of minimum variance unbiased 
estimator, 18 

of product, of dependent variables, 84, 252

of independent variables, 12 
of random variable, 9, 14 
of sum of random variables, 11 
Variance of difference estimate, in double 
sampling, for difference estima¬ 
tion, 140-142 

with pps selection, 145-147 
in sampling over two occasions, 154 
with several x-variates in simple
random sampling, 102 
in wtr simple random sampling, 99 
Variance of pps estimate, in double 
sampling, 142 

in multistage wr sampling, 120 
in multistage wtr sampling, 118 
in selection, of one psu per randomized 
substratum, 134 

of psu’s with pps of remainder, 135- 
136 

in stratified controlled selection, 77 
in stratified two-stage wtr sampling, 123

in stratified wr sampling, 127 
in wr sampling, 48-49 
in wtr sampling, 52, 54 




Variance of ratio estimate, in double 

sampling, for biased ratio estima¬ 
tion, 149 

for unbiased ratio estimation, 148 
in Politz and Simmons' method, 183- 
184 

in sampling with pp to aggregate 
size, 94 

with several x-variates in simple 
random sampling, 104 
in stratified random sampling, 105 
in stratified two-stage wr pps sam¬ 
pling, 128 

in two-stage wr pps sampling, 127 
in two-stage wtr random sampling, 125

in wtr simple random sampling, 89- 
90 

Variance of ratio-type estimate in simple 
random sampling, 97 
Variance of regression estimate, in 

double sampling for regression, 150 
in sampling over two occasions, 160 
in wtr simple random sampling, 100 
Variance of simple unbiased estimate, 
in cluster sampling, 108 
for domains of study, 199-200 
in double sampling for stratification, 152

in general sampling scheme, 40 
with interpenetrating subsamples, 38, 
193 

in Latin square sampling, 82 
in presence of response errors, 168 
in sampling over two occasions, 155, 159

in stratified wtr random sampling, 63 
in subsampling of nonrespondents, 79 
in systematic sampling, 44-45 
in two-stage sampling, 116 
in two-way stratification with small 
samples, 208 

in wr random sampling, 38 
in wtr random sampling, 35 
Variance of variance estimators, for 
independent variables, 189 
for method of random groups, 194-195




Variance of variance estimators, in randomized pps systematic sampling, 197-198
in sampling with pps of remainder, 196-197
in stratified sampling, 191
Variance estimation in multistage sampling, rules for, 215-217
Durbin’s rule, 56, 243
examples, 217


Raj’s rule, 215-216 

Variation, coefficient of (see Coefficient of variation)

Varying probabilities, sampling with replacement, 47
sampling without replacement, 50


Waksburg, J., 186 

Warner, S. L., 269

Weight function, best, 16 
Williams, W. H., 265 
With replacement sampling (see Sam¬ 
pling with replacement) 

Without replacement sampling (see 
Sampling without replacement)
Without replacement schemes better
than with replacement schemes, 55
Fellegi’s scheme, 10, 227
Raj’s scheme, 57-59

Wold, H., 214, 223 
Woodruff, R. S., 162, 163, 267 


X-proportional allocation, 67 
use for skew populations, 67-68 


Yates, F., 55, 60, 77, 84, 156, 163, 264 
Yates and Grundy’s variance estimator, 
55 

situations when positive, 7, 227, 9, 227 


Zero function, definition, 17 
use in estimation, 17, 156 
































